SECOND-ORDER FACTORS


L. L. THURSTONE 
THE UNIVERSITY OF CHICAGO 


Second-order factors are defined and illustrated in terms of a 
literal notation, a physical example, a diagrammatic representation, 
a geometrical example, and the matrix equations relating the first- 
order and second-order domains. Both kinds of factors are discussed 
as parameters which may be not only descriptive of the individual 
objects in a statistical population but also descriptive of the re- 
strictive conditions under which the objects were generated or se- 
lected. Second-order factors may be of significance in reconciling 
the several theories of intelligence. This paper is concerned with 
test configurations that show simple structure. If such a structure 
is not revealed, then the second-order domain is indeterminate. 


1. First-Order and Second-Order Factors 

Most of the work that has been done so far in the development of 
factorial theory has been concerned with the factors obtained from 
test correlations with or without rotation of axes for the selection of 
a suitable reference frame. Factors that are obtained from the test 
correlation will be called first-order factors whether they are selected 
so as to be orthogonal or oblique. We shall now consider the factors 
that may be determined from the correlations of the first-order fac- 
tors. Factors that are obtained from the correlations of the first-order 
factors will be called second-order factors. Factors of this type seem 
to be of fundamental significance in the interpretation of correlated 
variables.* 

Analysis of second-order factors and their relations to those of 
first-order can be presented in several different ways. We shall de- 
scribe these two factorial domains in terms of a literal notation, a 
physical example, a diagrammatic representation, a geometrical ex- 
ample, and the matrix equations relating the two domains. 

Consider first a reduced correlation matrix for the tests whose 
rank is, say, five. The factoring of this correlation matrix determines 
five arbitrary orthogonal unit reference vectors which may be denoted 
I, II, III, IV, and V. This orthogonal reference frame is arbitrary in
the sense that it is defined by the method of factoring which happens
to be used. This reference frame will be regarded as fixed and all
other vectors will be defined in terms of this fixed orthogonal frame,
which is designated by the subscript m. Let it be assumed that a
complete simple structure can be found in the test configuration and
let the corresponding primary vectors be denoted A, B, C, D, and E.
(These are ordinarily denoted $T_p$, $T_q$, etc.) We shall assume that
these primary traits are correlated in the experimental population.
Then the primary vectors in the test configuration will be separated
by acute angles whose cosines are the correlations between the primary
traits in the particular group of subjects studied. Let these
correlations be listed in a new correlation matrix of order 5 × 5
showing the correlations between primary factors. This correlation
matrix defines the second-order domain just as the correlation matrix
for the tests defines the first-order domain.

* This study of second-order factors is one of a series of investigations in the
development of multiple factor analysis and applications to the study of primary
mental abilities. This investigation has been supported by a research grant from
the Carnegie Corporation of New York. The Psychometric Laboratory has also
had support from the Social Science Research Committee of The University of
Chicago.

The simplest case is that in which the five primary factors are 
uncorrelated, in which case their correlation matrix is a unit matrix 
so that an analysis of a second-order domain is not immediately in- 
dicated. Next would be the case in which the reduced correlation ma- 
trix for the primary factors A, B, C, D, and E is of unit rank. There 
are two types of interpretation for such a situation. The correlations 
between the primary factors in a particular experimental population 
may be due to conditions of selection of the subjects, and in this case 
the correlations would be of no more theoretical importance than the 
conditions of selection of the subjects. If, on the other hand, the five 
primary functions A, B, C, D, and E actually do have some parameter 
in common, then one would expect their intercorrelations to be of 
unit rank for different experimental groups of subjects that are se- 
lected in different ways. In other words, the mere fact that a set of 
variables, or a set of factors, are correlated does not imply any scien- 
tific obligation to find “the” factors that account for the correlations 
because the factors, if found, might turn out to be as incidental in 
significance as the conditions by which the subjects happened to be 
selected. On the other hand, the fact that correlations between vari- 
ables, or between factors, can be caused by scientifically trivial cir- 
cumstances does not guarantee that all correlations between variables 
are of trivial significance. If the correlations between the five pri- 
mary factors in the present example should turn out to be of unit 
rank, then this circumstance merits a closer look because such a sim- 
plification would not often happen by chance. If the correlations be- 
tween the primary factors should turn out to be of unit rank for sev- 
eral different experimental groups, then we should have an obligation 
to ascertain the cause which must transcend the selective conditions. 

In order to avoid misunderstanding, perhaps it should be remarked
that in factor analysis we are using the term parameter in its
universal meaning in science. A parameter is one of the measure- 
ments that are used for describing or defining an object or event. In 
statistical theory the term parameter is frequently used in a more 
restricted sense as descriptive of the universe as contrasted with a 
statistic which is the corresponding measurement on a sample. We 
are not using the term in this restricted sense. 

Let it be assumed that the five primary factors do have a para- 
meter p in common. Then the five primaries could be expressed in 
the form 


A = f(p, a),
B = f(p, b),
C = f(p, c),
D = f(p, d),
E = f(p, e),


where each primary function is defined in terms of a parameter such
as a, b, c, d, or e, which is unique to itself and also in terms of
another parameter p, which it shares with the functions that define the
other primaries. If there should happen to be conspicuous correlation 
between the parameters a, b, c, d, and e in the particular group of 
subjects, then the unit rank of the second-order domain would be dis- 
turbed. If the correlations of the primaries show unit rank, then, in 
addition to the parameters a, b, c, d, and e, a second-order parameter 
or factor p can be postulated. 
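
The connection between a shared parameter and unit rank can be made explicit in a special case. The following is only a sketch under assumptions that the text does not state: each function f is taken to be linear, all quantities are standard scores, and p, a, b, c, d, and e are mutually uncorrelated. The loadings $g_A$, $g_B$, etc. are symbols introduced here for illustration.

```latex
\begin{aligned}
A &= g_A\,p + \sqrt{1-g_A^{2}}\;a, \qquad
B = g_B\,p + \sqrt{1-g_B^{2}}\;b, \qquad \ldots \\
r_{AB} &= \operatorname{corr}(A,B) = g_A\,g_B ,
\qquad\text{and in general}\qquad r_{XY} = g_X\,g_Y \quad (X \neq Y).
\end{aligned}
```

Under these assumptions every side correlation is the product of two loadings on p, so the reduced correlation matrix of the primaries is the outer product of the loading vector with itself and has rank one. Correlation among a, b, c, d, and e adds further terms to $r_{XY}$, which is how the unit rank would be disturbed.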

It should be noted that we now have six parameters, namely, 
a, b, c, d, e, and p, and since the rank of the test correlations is five, 
it follows that these six parameters are linearly dependent. In fact, 
the parameter p is now a linear combination of the other five para- 
meters. We can express these relations by a set of parameters such 
as A, B, C, D, E, and p, in which p is a linear combination of the five 
primary parameters. The five primaries are parameters descriptive 
of the first-order domain, and the parameter or factor p is descrip- 
tive of the second-order domain, which is here of unit rank. The sec- 
ond-order parameter is a linear combination of the five primaries that 
are defined by the original test correlations. If some degree of con- 
sistency can be found for these parameters for different groups of 
subjects, then all of these parameters should represent some aspects 
of the underlying physical and mental functions. 

Consider next a set of correlated primaries A, B, C, D, and E in 
which the parameter p appears in the first order as in the following
example:


A = f(p, a),
B = f(p, b),
C = f(p, c),
D = f(p, d),
E = f(p).


The rank of the reduced correlation matrix of the tests would now be
five. The five primaries listed above would be correlated and their
correlations would be of unit rank. The second-order factor p would be
determined from the correlations of the primaries. In this case the
communality for the primary factor E would be near unity, thus showing
that its total variance is common to the second-order factor p; hence
the primary factor E of the first order and the factor p of the second
order would be identical. The presence, or absence, of the primary E
could be determined by including, or excluding, a few tests in the
battery. We see, therefore, that the appearance of a factor in the
first order or in the second order may depend on the battery of
measurements taken; hence a factor should not be considered as
intrinsically different because it appears in the second order. This
circumstance can be determined by the selection of the test battery.

On the other hand, a parameter which always appears in the 
measurements in association with some other function would not ap- 
pear as in the primary E and it would be discovered experimentally 
nearly always in the second order. Such a limitation could be intro- 
duced by the physical nature of the attribute which the factor repre- 
sents, so that in such a case the second order would represent some- 
thing fundamentally different from that of the first order. A single fac- 
tor study is not likely to reveal whether a second order parameter is 
fundamentally different from the parameters of first order or whether 
the differentiation is caused merely by the selection of the test battery. 

In the following example we have another combination of pri- 
maries, 


A = f(p, a),
B = f(p, b),
C = f(p, c),
D = f(p, d),
E = f(e).


In this example the reduced rank of the test correlations would again
be five. The correlations of the primaries would show unit rank for
A, B, C, and D. The factor E would be orthogonal to the rest of the
system so that its row and column would have side correlations of
zero. The correlations of primaries would not be of unit rank if we
consider the whole table of order 5 × 5 but it would be of unit rank if
we consider only the 4 × 4 table for A, B, C, and D. Relations of this
kind can be found by inspection of the correlations of the primaries
and they may be indicative of the underlying order in the domain that
is being investigated.

The principles of a second-order domain have been discussed here 
in terms of the simple case in which that domain is of unit rank so 
that there is only one general second-order factor. It should be evi- 
dent that the organization of the second-order parameters can be of 
any rank and complexity. For example, the rank may be higher than 
one, and the second-order factors may extend to all of the primaries 
or only to some of them. The possibility of third-order and higher- 
order factors must be recognized but their experimental identification 
is of increasing difficulty the higher the order because of the insta- 
bility of such a superstructure on practically feasible experimental 
data. The number of second-order factors that can be determined 
from a given number of linearly independent primary factors follows 
the same restrictive relations that govern the number of primary fac- 
tors that can be determined from a given number of tests. Thus, for 
example, it is not to be expected that three second-order factors will 
be determinate from only five primary factors for the same reason 
that three primary factors cannot be determined from five tests. Fur- 
thermore, it is entirely possible in the same data for the first-order 
domain to give clear interpretation of a set of primary factors and 
for the second-order domain to be indeterminate or ambiguous. 
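
The restrictive relation is not written out in this section. As a minimal sketch, assume it is the usual counting condition that r common factors are determinate from n variables only when (n - r)² ≥ n + r; that reading, and the function name below, are assumptions made for illustration. The same function applies to either order: n is the number of tests when counting primaries, or the number of primaries when counting second-order factors.

```python
def max_determinate_factors(n_variables):
    """Largest r < n with (n - r)**2 >= n + r (assumed counting condition)."""
    r = 0
    while (r + 1 < n_variables
           and (n_variables - (r + 1)) ** 2 >= n_variables + (r + 1)):
        r += 1
    return r


if __name__ == "__main__":
    # Tabulate the bound for a few battery sizes.
    for n in (3, 5, 6, 8):
        print(n, max_determinate_factors(n))
```

Under this assumption five variables admit at most two determinate factors and six are needed for three, which agrees with the statement that three factors cannot be determined from five tests or from five primaries.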


2. The Box Example 

In order to illustrate the nature of first- and second-order factors, 
we shall make use of populations of simple objects or geometrical
figures and their measurable properties instead of dealing with these
factors merely as logical abstractions. We have used a population of
rectangular boxes and their measurable attributes to illustrate the
principles of correlated primary factors and we can use them also for
the present discussion.*

* Thurstone, L. L. Current issues in factor analysis, Psychol. Bull., 1940, 37,
p. 222.

A random collection of rectangular boxes was represented by the
three measurements length (x), width (y), and height (z). A list
of measurements was prepared which could be made on each box,
such as the diagonal of the front face, the area of the top surface, the
length of a vertical edge, and so on. Each of these measurements
represented a test score and each box represented an individual member
of the statistical population. The correlations between the
measurements were computed and analyzed factorially as if we did not know
anything about the exact nature of each measurement, which was
treated as a test score of unknown factorial composition. As has been
shown previously, the analysis revealed three factors in the
correlations for the particular set of measurements used. The configuration
showed a complete simple structure and a set of primary vectors was
determined by the configuration. These three primaries represented
the three basic parameters in terms of which all the test measurements
had been expressed.

The three primary vectors were separated by acute angles whose 
cosines represented the correlations between the three basic
parameters that were used in setting up the box example. These three
correlations could be assembled into a small correlation matrix of order
3 × 3. The physical interpretation of the positive correlations was
that large boxes tend to have all of their dimensions larger than small
that large boxes tend to have all of their dimensions larger than small 
boxes. In other words, if one of the dimensions of a box shape is, say, 
six feet, the other dimensions of the box are not likely to be of the 
order of, say, two or three inches. The table of correlations of the 
three primary factors X, Y, Z, could be represented by a single com- 
mon factor. This factor would be a second-order factor. It would, 
no doubt, be interpreted as a size factor in the box example. If this 
second-order size factor were denoted s, we should have four para- 
meters for describing the box shapes, namely, the three dimensions, 
x, y, and z and the size factor s. These four parameters or factors 
would be linearly dependent because the rank of the correlation
matrix of the tests was three.

In the case of the box example, a size factor or parameter could 
be determined in the first-order if desired. For this purpose we could 
use the first centroid axis, the major principal axis, or the volume 
vector, all of which can be easily defined in the first-order system of 
test vectors. The four parameters so chosen would also be linearly 
dependent. If we wanted to use only three linearly independent
parameters including a size factor, that could be done in the first order
by choosing, say, the two ratios x/y = $r_1$ and x/z = $r_2$ as well as the
volume vector v. These three factors would be linearly independent
but they would be correlated. The latitude with which we can choose
simplifying parameters for the box example is determined in part by
the fact that three factors can nearly always be represented by a
common factor whereas this is not the case when the rank is higher than
three.
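
The reason three correlated factors can nearly always be fitted by one common factor is that three correlations give exactly three equations for three loadings. The sketch below solves them with the familiar triad formula. The correlations among X, Y, and Z are hypothetical values chosen for illustration, not figures from the box study, and the function name is likewise an assumption.

```python
from math import sqrt


def single_factor_loadings(r_xy, r_xz, r_yz):
    """Loadings of three variables on one common factor (triad formula).

    Assumes all three correlations are positive, as in the box example.
    """
    g_x = sqrt(r_xy * r_xz / r_yz)
    g_y = sqrt(r_xy * r_yz / r_xz)
    g_z = sqrt(r_xz * r_yz / r_xy)
    return g_x, g_y, g_z


# Hypothetical correlations between the primaries X, Y and Z.
g_x, g_y, g_z = single_factor_loadings(0.56, 0.48, 0.42)

# Each side correlation is reproduced as a product of two loadings.
assert abs(g_x * g_y - 0.56) < 1e-12
assert abs(g_x * g_z - 0.48) < 1e-12
assert abs(g_y * g_z - 0.42) < 1e-12
```

The qualification "nearly always" corresponds to the cases in which this solution breaks down: the product of the three correlations is negative, or one of the computed loadings exceeds unity.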


3. Diagrammatic Representation 
The relations between the first-order and the second-order do- 
mains can be represented diagrammatically as shown in Figures 1, 
2, and 3. In Figure 1 we have a set of eight tests whose correlations
are accounted for by five primary factors, A, B, C, D, and E, which 
are uncorrelated. The factor A, for example, is present in the com- 
mon factor variances of tests 1, 2, and 4. The primary factor E is 
present in the common factor variances of all the tests, and hence E 
would be called a general factor for the particular battery. Since it 
is orthogonal to all the other primary factors it may be called an 
orthogonal general factor of the first order. In order to determine the 
nature of the factor E it would be necessary to study it in different
test batteries so that one could predict with certainty when the factor 
would be present and when it would be absent from a test. Since the 
primary factors are here represented as uncorrelated, the matrix of 
correlations of the primary factors would be an identity matrix and 
there would be no immediate provocation to investigate a second- 
order domain. 

In Figure 2 we have represented a set of tests and five primary 
factors A, B, C, D, and E. (We are not here concerned as to whether 
the particular number of tests represented in this diagram is ade- 
quate for the determination of five primary factors. The purpose of 
these diagrams is merely to show the nature of the relations between 
the two domains.) The rank of the correlation matrix of the tests 
would here be five, which corresponds to the number of linearly in- 
dependent primary factors. In the present case we should find that 
the primary factors are themselves correlated. The matrix of
correlations of these primaries would be of order 5 × 5 and it would be
of unit rank. The correlations between the primary factors could 
therefore be accounted for by a single general second-order factor 
that is denoted G. If both the first-order and second-order factors 
were to be used for the description of the tests and the relations, we 
should have six parameters which would be linearly dependent be- 
cause the rank of the correlations of the tests is only five. In fact, 
the saturation of each test with the second-order factor G would be 
a linear combination of the saturations of the test with the five pri- 
maries of the first order. None of the primary factors are general 
factors in this figure. 

In Figure 3 we have a more complex relation in that the correla- 
tion matrix for the primary factors would be of rank two. One of the 
second-order factors is here shown to be common to all but one of the 
primary factors, one of the second-order factors is a factorial doub- 
let in that it represents additional correlation between the primaries 
B and D, and the primary factor A is orthogonal to the rest of the 
primaries so that it does not participate in the second-order domain. 
This diagram is drawn merely to illustrate the variations in
complexity that may be found in factorial studies.

The two types of general factor here shown in Figures 1 and 2 
have some interesting differences. The general factor E of Figure 1 
is independent of the other primary factors while the general factor 
G in Figure 2 is present in all of the other factors. Hence we must 
conclude that a second-order general factor is a part of, and must 
participate in, the definition of the other factors while the orthogonal 
general factor E of Figure 1 is, by definition, independent of the other 
primary factors. It is evident, therefore, that a general second-order 
factor is likely to be of more fundamental significance for the domain 
in question than a general orthogonal first-order factor. An orthogo- 
nal general factor of the first order might operate in a test without 
any group factor whereas a second-order general factor would oper- 
ate, ordinarily, through the mechanism of some function that could 
be identified as a group factor, a primary factor, or a special ability. 

The factor patterns corresponding to the relations shown dia- 
grammatically in these figures are given in Tables 1, 2 and 3. Table 
1 shows the factor pattern for Figure 1. Here the orthogonal general 
factor E is identified by the fact that all entries of its column are 
filled. Table 2 shows the factor pattern for Figure 2. Here it is seen 
by the factor pattern that a group such as tests 1, 4, 5, and 7 have 
no primary factor in common and that hence their correlations would 
be determined only by the second-order general factor G. The deter- 
minant of the correlations for these four tests (the tetrad difference) 
would therefore vanish. The second-order factor matrix is also shown 
in this table with only one factor G to correspond to this example. 
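
The vanishing of the tetrad differences can be checked numerically. The following is a minimal sketch, assuming that the four tests share nothing but G, so that each side correlation is the product of two saturations on G. The saturations are hypothetical and are not the entries of Table 2, and the function names are introduced here for illustration.

```python
def correlations_from_general_factor(saturations):
    """Correlation matrix implied when the tests share only one factor."""
    n = len(saturations)
    return [
        [1.0 if a == b else saturations[a] * saturations[b] for b in range(n)]
        for a in range(n)
    ]


def tetrad_differences(R, i, j, k, l):
    """Return the three tetrad differences for variables i, j, k, l."""
    return (
        R[i][j] * R[k][l] - R[i][k] * R[j][l],
        R[i][j] * R[k][l] - R[i][l] * R[j][k],
        R[i][k] * R[j][l] - R[i][l] * R[j][k],
    )


# Hypothetical saturations of tests 1, 4, 5 and 7 on G (not from Table 2).
g = [0.7, 0.6, 0.5, 0.4]
R = correlations_from_general_factor(g)
tetrads = tetrad_differences(R, 0, 1, 2, 3)

# All three differences vanish up to rounding error.
assert all(abs(t) < 1e-12 for t in tetrads)
```

If any two of the four tests also shared a primary factor, their correlation would exceed the product of their saturations on G and the tetrad differences involving that pair would no longer vanish.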

The question might be raised whether both types of general fac- 
tor could be present in the same battery. That seems possible. In that 
case a simple structure could define the primary factors A, B, C, and 
D but not E in the particular battery of Figure 1. This factor could 
be assumed arbitrarily to be orthogonal to the other factors, but then 
the line GE of Figure 2 would be erased to correspond to the fact that 
E is orthogonal to the other factors. One or more second-order gen- 
eral factors could be found in the correlated primaries. If the corre- 
lations of A, B, C, and D were of unit rank, another alternative would 
be to set E in such a relation to the other primary vectors as to main- 
tain the unit rank with the second-order general factor. It might 
then be found that the vector E has non-vanishing projections on all
the test vectors, in which case both types of general factor would be 
assumed to be a possible set of explanatory parameters for the bat- 
tery in question. It must be remembered that these various locations 
of the reference frame for the explanatory parameters in both the 
first-order and the second-order domains have validity only in so far 
as they are suggestive of fruitful scientific interpretation. If this is
not the purpose, then the factorial resolution might as well remain in 
the arbitrary orthogonal factors produced by factoring the given test 
correlations—or, better still, by not doing the factoring at all. 

It might be asked how the correlations of a test battery can be 
resolved into a second-order domain of unit rank which is lower than 
the rank of the test correlations. The transitions can be regarded 
geometrically. The unit test vectors usually define a space of as many 
dimensions as there are tests. When the reduced correlation matrix 
is considered, its rank is frequently lower than its order. Hence the 
reduction from the number of tests n to the number of primary
factors r represents a reduction from the total variance of the tests
to the common factor variance. The complete correlation matrix for
the primary factors represents a set of r unit vectors in as many
dimensions, the dimensionality of the common factor space. The
reduced form of this matrix for the example of Figure 2 would
have unit rank because the side correlations are determined only by
that which the primaries have in common, namely, the second-order
general factor.


4. Group Factors and Primary Factors 

In Figure 4 we have a diagrammatic representation of a differ- 
ent kind of resolution of factors in the second-order domain and their 
relation to the primary factors. In this example the rank of the cor- 
relation matrix of tests is assumed to be five as represented by as 
many primary factors A, B, C, D, and E. Let it be assumed that the
correlation matrix for these five primaries is of unit rank. The gen- 
eral second-order factor G then accounts for the observed correlations 
of the primary factors. If the five linearly independent primary unit 
vectors and the second-order unit vector G are to be represented in 
the same space, the dimensionality of this space must be six. It is 
possible to locate in this augmented space another set of unit vectors 
a, b, c, d, and e which are mutually orthogonal and which are also
orthogonal to the unit vector G. Then we have the orthogonal refer- 
ence frame G, a, b, c, d, and e which defines the six dimensions of the 
first- and second-order factors but not the test space. The five linear- 
ly independent primary factors define a five-dimensional space corre- 
sponding to the rank of the test correlations, and this space is a part 
of the total six-dimensional space of this representation. 

The unit vector a is a linear combination of the unit vector G 
and the primary vector A. The relation is similar for the other pri- 
mary vectors. The primary vectors A, B, C, D, and E are correlated 
and of unit rank whereas the vectors, a, b, c, d, and e are arbitrarily 
set orthogonal to each other. In general, if the rank of the test
correlations is r, and if the correlations of the primaries are of unit
rank, then the primaries define a unit vector G for a general second-order
factor in an augmented space of dimensionality (r + 1) and also a
set of r mutually orthogonal unit vectors each of which is in the plane
of the second-order general factor and one of the primaries. These
vectors are arbitrarily set orthogonal to the general second-order
factor and they are called group factors. In Figure 4 the primary
factors are denoted A, B, C, D, and E and the group factors are denoted
a, b, c, d, and e. With this resolution we have (r + 1) linearly
dependent factors which represent the test correlations of rank r. This
type of resolution is preferred by some students who use the refer- 
ence frame G, a, b, c, d, and e because it is orthogonal rather than the 
frame G, A, B, C, D, and E which is oblique. 
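
The resolution into G and group factors can be written out for a small case. In this sketch each primary is expressed on the orthogonal frame G, a, b, c, d, e by two direction cosines: its correlation with G and the complementary cosine on its own group factor. The five correlations with G are hypothetical, and the function names are introduced here only for illustration.

```python
from math import sqrt


def primaries_on_orthogonal_frame(r_tg):
    """Direction cosines of each primary on (G, group factors).

    Row t has r_tg[t] in the G column and sqrt(1 - r_tg[t]**2) in the
    column of its own group factor; every other entry is zero.
    """
    rows = []
    for t, r in enumerate(r_tg):
        row = [r] + [0.0] * len(r_tg)
        row[1 + t] = sqrt(1.0 - r * r)
        rows.append(row)
    return rows


def scalar_products(rows):
    """Matrix of scalar products between the row vectors."""
    return [[sum(x * y for x, y in zip(a, b)) for b in rows] for a in rows]


# Hypothetical correlations of A, B, C, D and E with G.
r_tg = [0.8, 0.7, 0.6, 0.5, 0.4]
U_t = primaries_on_orthogonal_frame(r_tg)
R_t = scalar_products(U_t)

# Each primary is a unit vector, and two primaries correlate only through G.
assert abs(R_t[0][0] - 1.0) < 1e-12
assert abs(R_t[0][1] - 0.8 * 0.7) < 1e-12
```

The scalar products reproduce a correlation matrix of unit rank for the primaries, with six columns (G and five group factors) describing a configuration whose rank is five.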


5. A General Second-Order Factor 

The algebraic and computational relations between the first-order 
and the second-order domains will be shown for the case of a single 
general second-order factor because of the interest of this case for 
the psychological controversies of the past forty years about Spear- 
man’s general intellective factor. The algebraic and computational re- 
lations to be shown can be generalized to second-order domains of 
higher than unit rank. It must be remembered, however, that the 
restriction of our discussion to unit rank for the second-order domain 
does not in any way imply that such low rank is always to be expected. 
The methods of analysis can be readily extended to a second order of 
higher rank when the data indicate a determinate second-order con- 
figuration. In any case, the second-order rank should be considerably 
lower than the rank r of the first-order factors in order to justify in- 
terpretation. 

The primary vectors constitute a set of r linearly independent 
unit vectors that define a space of dimensionality equal to the rank 
of the test correlations. In order to represent a general second-order 
factor as a unit vector in the same configuration it is necessary to 
augment the dimensionality to (r + 1) dimensions. A second-order 
domain of rank two would thus require an augmented space of di- 
mensionality (r + 2). The projections of the test vectors on these 
additional vectors in the augmented space can, however, be expressed 
as linear combinations of the test projections on the primary vectors 
or on any set of r linearly independent vectors in the common factor
space. The procedures for determining these saturations will be shown
without writing explicitly the (r + 1) co-ordinates of the second-order
unit vectors in the augmented space.

The present discussion is confined to factorial data that satisfy
two conditions, namely 1) that a complete simple structure is revealed 
in the test configuration and 2) that the second-order correlation ma- 
trix is of unit rank. These methods can be adapted to the analysis of 
less than r primary factors and the methods can be adapted to higher 
second-order rank. 

One of two objectives will be assumed, namely (1) to determine 
the projections (saturations) of the tests on the second-order factor 
in addition to the projections on the primary reference vectors or (2) 
to determine the projections on the second-order factor and also on 
the orthogonal group factors. It will be convenient to discuss the alge- 
braic relations under four cases because of the different computational 
routes that may be chosen. These four cases are: 


Case 1. Transformation from F to V including the column vector G 


This transformation is shown in rectangular notation in Table 4
for the equation

$$F_{jm}\,\Psi_{mp} = V_{jp}, \tag{1}$$

in which the matrix $V_{jp}$ has an extra column for the second-order
factor G with elements $v_{jg}$, which may also be denoted $r_{jg}$ because
these are the correlations between the tests j and the general factor G.
The transformation matrix $\Psi_{mp}$ is identical with $\Lambda_{mp}$ except for
the added column G with elements $\psi_{mg}$, which are to be determined.
Consider the matrix T as an extension of the factor matrix F. The rows
of T give the direction cosines of the primary vectors $T_t$ with elements
$t_{tm}$. The same transformation gives

$$T_{tm}\,\Psi_{mp} = V_{tp}, \tag{2}$$

which is the diagonal matrix D except for the first column. Applying
the transformation $\Psi_{mg}$, we have

$$T_{tm}\,\Psi_{mg} = r_{tg}, \tag{3}$$

where $r_{tg}$ is the first column of $V_t$ and its elements are the
correlations of the primary factors with the general factor G. These are
known from the factoring of the unit-rank correlation matrix for the
primaries. Then

$$\Psi_{mg} = T^{-1}\,r_{tg}, \tag{4}$$

and, since $T = D\Lambda^{-1}$, we have

$$\Psi_{mg} = \Lambda D^{-1}\,r_{tg}, \tag{5}$$








82 PSYCHOMETRIKA 


from which the first column of $\Psi_{mp}$ can be computed. Hence the
column G of the augmented oblique factor matrix V becomes known.


Case 2. Transformation from F to U including group factors and gen- 
eral factor G 


Here the computation starts again with the orthogonal factor
matrix F and the objective is to determine the saturations of the tests
j with the r group factors and the second-order general factor G. This
transformation is also shown in Table 4 in rectangular notation by
the equation

$$F_{jm}\,\Psi_{mw} = U_{jw}, \tag{6}$$

where U is the factor matrix showing the projections of tests j on
group factors and general factor G. These (r + 1) mutually orthogonal
factors will be designated by the subscript w. The first column
of this matrix is again the column of correlations $r_{jg}$. If the same
transformation is applied to the matrix T for the primary vectors,
we have

$$T_{tm}\,\Psi_{mw} = U_{tw}, \tag{7}$$

which is also a diagonal matrix except for the first column, which
contains the correlations $r_{tg}$ between the primary factors and the
second-order general factor G. These saturations can be determined from the
unit-rank correlation matrix $TT' = R_t$ for the primary factors.*
Consider the first row of $U_t$. The two non-vanishing entries in this row
show the direction cosines of $T_A$ in terms of the orthogonal frame G, a,
b, and c. The primary vector $T_A$ is a linear combination of the two
orthogonal unit vectors G and a. Hence, when $r_{Ag}$ is known, we have

$$u_{Ag}^{2} + u_{Aa}^{2} = 1, \tag{8}$$

or

$$r_{Ag}^{2} + u_{Aa}^{2} = 1, \tag{9}$$

so that the element $u_{Aa}$ is known. The other diagonal elements of $U_t$
are determined in the same way so that, for example,

$$r_{Bg}^{2} + u_{Bb}^{2} = 1. \tag{10}$$

When the matrix $U_t$ is known, we have, by (7),

$$\Psi_{mw} = T^{-1}\,U_t, \tag{11}$$

or

$$\Psi_{mw} = \Lambda D^{-1}\,U_t, \tag{12}$$

so that the transformation $\Psi_{mw}$ is known. The saturations of tests j
on the second-order general factor G and the group factors w can then
be computed.

* Elsewhere we have denoted this matrix $R_{pq}$, but we are here using the
subscript t for the primary vectors $T_t$ and reserving the subscripts p and q
for the primary reference vectors A, B, and C. Hence the correlations of the
primary factors are here denoted $R_t$ instead of $R_{pq}$.

The transformation matrix $\Psi_{mw}$ represents a rigid rotation from
one orthogonal frame to another orthogonal frame, and hence this
transformation matrix must be orthogonal by rows. A fourth row
could be added to $\Psi_{mw}$ for a fourth orthogonal unit vector IV with
cell entries which normalize each column. Then we should have an
orthogonal matrix of order 4 × 4.
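
The construction of the matrix of primaries on the frame G, a, b, c in equations (8) to (10) can be sketched numerically. The following Python fragment is only an illustration and not part of the original computations. It borrows the four saturations on G reported for the trapezoid example in Table 13, so that there are four primaries rather than the three of the literal example, and the function names are hypothetical.

```python
import math

# Saturations of the four trapezoid primaries on G, as listed in Table 13.
r_tg = [0.71, 0.70, 0.73, 0.45]


def build_u_t(saturations):
    """Rows are primaries; column 0 is G, columns 1..n are the group factors."""
    n = len(saturations)
    u = [[0.0] * (n + 1) for _ in range(n)]
    for i, r in enumerate(saturations):
        u[i][0] = r                          # first column: correlation with G
        u[i][i + 1] = math.sqrt(1.0 - r * r)  # equations (8) to (10)
    return u


def row_products(u):
    """Scalar products of all pairs of rows, i.e. the matrix U U'."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in u] for ri in u]


U_t = build_u_t(r_tg)
R = row_products(U_t)
for row in R:
    print([round(x, 3) for x in row])
```

Each row has unit length by construction, and the scalar product of two different rows is the product of their saturations on G, which is the unit-rank pattern assumed for the correlations of the primaries.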


Case 3. Transformation from V to U including the group factors and 
column vector G. 


Here it is assumed that the computations are to be made from
the oblique factor matrix V. In Table 5 we have the transformation
equation in rectangular notation, namely,

    $V_{jp}\,\Psi_{pw} = U_{jw}$,    (13)

which gives the saturations of the tests j on the group factors and
on the general factor. If the factor matrix V is extended to include
the primary vectors T we have the diagonal matrix D. Applying the
same transformation to D we have

    $D_{tp}\,\Psi_{pw} = U_t$,    (14)

so that

    $\Psi_{pw} = D^{-1}_{tp}\, U_t$.    (15)

When the elements of $U_t$ have been determined as for Case 2, the
transformation $\Psi_{pw}$ can be written by merely adjusting the rows of
$U_t$ by the multipliers of $D^{-1}_{tp}$. The transformation $\Psi_{pw}$ is then known.
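
As a hedged numerical sketch of equation (15), the rows of the matrix of primaries can be rescaled by the diagonal entries of the inverse of D. The saturations and the multipliers are taken here from Tables 13 and 12 of the trapezoid example, reading the four multipliers as 1.237, 1.237, 1.273, and 1.091; the variable names are illustrative only.

```python
import math

r_tg = [0.71, 0.70, 0.73, 0.45]        # saturations of the primaries on G
d_inv = [1.237, 1.237, 1.273, 1.091]   # diagonal of D^{-1}, as read from Table 12

# Matrix of primaries on G and the group factors, built as in Case 2:
# first column r_tg, then one diagonal entry per group factor.
U_t = []
for i, r in enumerate(r_tg):
    row = [0.0] * (len(r_tg) + 1)
    row[0] = r
    row[i + 1] = math.sqrt(1.0 - r * r)
    U_t.append(row)

# Equation (15): multiply each row of U_t by the matching multiplier.
psi_pw = [[m * u for u in row] for m, row in zip(d_inv, U_t)]

first_column = [round(row[0], 3) for row in psi_pw]
print(first_column)
```

The first column of this transformation is the column vector used in Case 4, so it should agree closely with the column vector listed in Table 13.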


Case 4. Transformation from V to column vector G 


This is the simplest case and perhaps the most useful as regards
the second-order domain. The matrix V is known in determining the
simple structure and the primaries. The saturations of the tests j on
the second-order general factor G are of interest and these can be
determined as linear combinations of the columns of V. Here we have
the transformation shown in Table 5, namely,

    $V_{jp}\,\Psi_{pg} = r_{jg}$.    (16)

Applying the same transformation to $D_{tp}$, we have

    $D_{tp}\,\Psi_{pg} = r_{tg}$.    (17)

The elements of the column vector $r_{tg}$ are known from the correlation
matrix $TT' = R_t$ of the primaries. Then

    $\Psi_{pg} = D^{-1}_{tp}\, r_{tg}$,    (18)

and hence the column vector $\Psi_{pg}$ is known. In computing, it is only
necessary to multiply the elements $r_{tg}$ by the corresponding diagonal
elements of $D^{-1}_{tp}$ to determine $\Psi_{pg}$. The desired column vector $r_{jg}$ can
then be determined.


6. A Trapezoid Population 

In previous studies of factorial theory it has been found useful
to illustrate the principles by means of a population of simple physi- 
cal objects or geometrical figures. The box population was used to 
illustrate three correlated factors and their physical interpretation. 
In the present case we want four factors in the first-order domain 
which by their correlations of unit rank determine a general second- 
order factor. The correlations of three variables can nearly always 
be accounted for by a single factor and hence it seems better to choose 
a four-dimensional system in which the existence of a second-order 
general factor is more clearly indicated by the unit rank of the corre- 
lations of four primary factors. For the present physical illustration 
we have chosen a population of trapezoids whose shapes are deter- 
mined by four primary parameters or factors. 

The measurements on the trapezoids are indicated in Figure 5. 
The base line is bisected and the length of each half is denoted by the 
parameter c. An ordinate is erected at this midpoint and its length 
is h. This ordinate divides the top section into two parts which are 
denoted a and b as shown. These four parameters, a, b, c, and h, com- 
pletely determine the figure. The test battery was represented by 
sixteen measurements which are drawn in the figure. The parameters 
a, b, c, and h are given code numbers 1, 2, 3, and 4, respectively. Vari- 
ables (12) and (13) are the two areas as shown. The sum of (12) 
and (13) equals the total area of the trapezoid. In general, each of
these measurements is a function of two or three of the parameters
but not of all four of them and hence we should expect a simple
structure in this set of measurements. There is a rather general
impression that a simple structure is necessarily confined to the positive
manifold. In order to offset this impression we included here three
additional measures which extend the simple structure beyond the
positive manifold. The three additional measures are as follows:

    14 = (1) / (2) = a/b
    15 = (2) / (3) = b/c
    16 = (1) / (3) = a/c

These three measures will necessarily introduce negative saturations
on some of the basic factors.

In Table 6 we have a list of dimensions for a set of thirty-two 
trapezoids. These will constitute the trapezoid population. Each figure
was drawn to scale on cross-section paper and then the sixteen
measurements were made on each figure. These constituted the test 
scores for the present example. In setting up the dimensions of Table 
6 the numbers were not distributed entirely at random. To do so 
would tend to make the correlations between the four basic para- 
meters a, b, c, and h approach zero and this would lead to an orthog- 
onal simple structure in which there would be no provocation to in- 
vestigate a second-order domain. The manner in which the generat- 
ing conditions of the objects determine the factorial results will be 
discussed in a later section. Table 6 was so constructed that, in addi- 
tion to the four basic parameters, there was also a size factor which 
functioned as a second-order parameter in determining correlation 
between the four primary factors in generating the figures. 

The product-moment correlations between the sixteen measure- 
ments for the thirty-two objects were computed and these are listed 
in Table 7. This correlation matrix was factored by the group cen- 
troid method and the resulting factor matrix F is shown in Table 8. 
The fourth-factor residuals are listed in Table 9, which indicates 
that the residuals are vanishingly small. Applying the rotational 
methods to the configuration, we found the transformation matrix $\Lambda$
of Table 10, which produced the oblique factor matrix V of Table 11.
In this matrix we are now concerned with all but the last column.
When pairs of columns of the factor matrix V are plotted we have
the configuration shown in the diagrams of Figure 6, in which a simple
structure is clearly indicated. The cosine of the angle between
the reference vectors is indicated on each diagram of Figure 6. These
cosines were obtained from the relation $C = \Lambda'\Lambda$ as shown in Table 12.
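
The relation between the transformation matrix and the cosines of the reference vectors can be checked with a few lines of Python. This is a sketch only: the entries are copied from Table 10 as read here, to two decimals, so the products agree with Table 12 only to about one unit in the second decimal, and the function name is hypothetical.

```python
# Transformation matrix of Table 10: rows are centroid axes I to IV,
# columns are the reference vectors A, B, C, H.
LAMBDA = [
    [0.07, 0.12, 0.39, 0.53],
    [0.70, -0.01, -0.81, 0.35],
    [0.71, -0.32, 0.28, -0.64],
    [-0.01, 0.94, -0.34, -0.44],
]


def reference_cosines(lam):
    """Scalar products of the columns of lam, i.e. the transpose times lam."""
    columns = list(zip(*lam))
    return [[sum(x * y for x, y in zip(ci, cj)) for cj in columns]
            for ci in columns]


C = reference_cosines(LAMBDA)
for label, row in zip("ABCH", C):
    print(label, [round(x, 2) for x in row])
```

The diagonal entries come out close to unity because each column of the transformation is a unit vector, and the negative side entries are the cosines shown on the diagrams.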

So far in the analysis we have found that four primary factors 
account for the correlations and this corresponds to the fact that we 
used four parameters in setting up the trapezoid figures. The four pri- 
mary factors are correlated as indicated by the obliqueness of the ref- 
erence axes in the diagrams of Figure 6. The next step is to deter- 
mine the correlations between the primary factors that correspond to 
the primary reference axes. For this purpose the inverse of the ma- 
trix C is computed as shown in Table 12. From the diagonal values of 
this matrix are found the numerical values of the diagonal matrix D, 
which is also shown in Table 12. The inverse of this diagonal matrix 
is also listed. These numerical values are merely the reciprocals of 
the entries in D. 
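
The steps just described, namely inverting C, forming D from the diagonal of the inverse, and then forming the correlations of the primaries as D times the inverse of C times D (the relation given in the heading of Table 13), can be sketched as follows. The Gauss-Jordan routine is a generic stand-in and is not claimed to be the computing method used for the paper; the matrix C is copied from Table 12.

```python
import math

# Cosines between the reference vectors A, B, C, H (Table 12).
C = [
    [1.00, -0.24, -0.34, -0.17],
    [-0.24, 1.00, -0.35, -0.15],
    [-0.34, -0.35, 1.00, -0.11],
    [-0.17, -0.15, -0.11, 1.00],
]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting, for small dense matrices."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


C_inv = invert(C)
# Diagonal of D: reciprocal square roots of the diagonal of the inverse.
d = [1.0 / math.sqrt(C_inv[i][i]) for i in range(4)]
# Correlations of the primaries: D C^{-1} D.
R_t = [[d[i] * C_inv[i][j] * d[j] for j in range(4)] for i in range(4)]

print([round(x, 3) for x in d])
for row in R_t:
    print([round(x, 2) for x in row])
```

Because C is given only to two decimals, the results should be expected to match the tabled values of D and of the correlations of the primaries only approximately.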
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In Table 13 we have the correlation matrix R; showing the corre- 
lations between the primary factors. These are the cosines of the
angles between the primary vectors. It can be seen by inspection that
this matrix is close to unit rank, which indicates that a single general
second-order factor can be postulated to account for the correlations
between the primary factors. The saturation of each primary factor
with this second-order general factor was determined by one of the
special formulas for unit rank and the saturations are listed in the
column vector $r_{tg}$. The interpretation is, for example, that the primary
factor A has a correlation of .71 with the second-order general factor
G. The closeness of the correlation matrix to unit rank is shown by
the small side correlations in the residual matrix of Table 13. The
diagonal values of the residual matrix show that part of the total
variance of each primary factor which it does not share with the general
second-order factor. If the diagonals of this matrix vanished
completely, then the primaries would have their total variance in common
and the original reduced correlation matrix for the tests would
have been of unit rank.
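
The text does not say which of the special formulas for unit rank was used. As an assumption made only for illustration, the sketch below uses a pooled-triad estimate: under unit rank the product of two side correlations of a variable divided by the correlation of the other two variables equals its squared saturation, and the estimate pools these triads by taking the ratio of sums. The correlations are those of Table 13; the function name is hypothetical.

```python
import math
from itertools import combinations

# Correlations of the primaries T_a, T_b, T_c, T_h (Table 13).
R_t = [
    [1.00, 0.48, 0.53, 0.34],
    [0.48, 1.00, 0.53, 0.33],
    [0.53, 0.53, 1.00, 0.32],
    [0.34, 0.33, 0.32, 1.00],
]


def single_factor_saturations(r):
    """Pooled-triad estimate of the saturations on one general factor."""
    n = len(r)
    saturations = []
    for j in range(n):
        others = [k for k in range(n) if k != j]
        pairs = list(combinations(others, 2))
        numerator = sum(r[j][k] * r[j][l] for k, l in pairs)
        denominator = sum(r[k][l] for k, l in pairs)
        saturations.append(math.sqrt(numerator / denominator))
    return saturations


r_tg = single_factor_saturations(R_t)
# Residuals after removing the general factor: R_t minus r_tg r_tg'.
residuals = [[R_t[i][j] - r_tg[i] * r_tg[j] for j in range(4)] for i in range(4)]

print([round(x, 2) for x in r_tg])
for row in residuals:
    print([round(x, 2) for x in row])
```

With this formula the saturations fall within about one unit in the second decimal of the tabled column vector, the side residuals are small, and the diagonal residuals give the part of each primary's variance not shared with G.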

The saturation of each test with the second-order general factor
was determined as a linear combination of the columns of the oblique
factor matrix V of Table 11. The transformation of equation (18)
was used and the numerical values of $\Psi_{pg}$ were listed in Table 13.
Column G of Table 11 was then computed by equation (16).
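
The computation of column G can be retraced from the published figures. The sketch below applies equation (16) to the four oblique columns of Table 11 with the column vector of Table 13 and compares the result with the tabled column G. The values are copied as read here; the dictionary names are illustrative.

```python
# Oblique factor matrix V (columns A, B, C, H), from Table 11.
V = {
    1: (0.79, 0.01, -0.01, -0.02), 2: (0.01, 0.63, 0.05, 0.05),
    3: (0.00, 0.01, 0.78, 0.00), 4: (0.01, -0.01, 0.00, 0.93),
    5: (-0.11, 0.00, 0.21, 0.86), 6: (0.28, 0.03, -0.03, 0.76),
    7: (0.27, 0.02, 0.47, 0.31), 8: (0.01, 0.03, 0.47, 0.55),
    9: (0.00, 0.34, 0.46, 0.25), 10: (-0.02, 0.38, -0.02, 0.71),
    11: (0.02, 0.10, -0.09, 0.91), 12: (0.20, 0.04, 0.35, 0.50),
    13: (0.01, 0.20, 0.33, 0.55), 14: (0.81, -0.54, 0.01, 0.03),
    15: (-0.01, 0.41, -0.89, -0.02), 16: (0.52, 0.01, -0.82, -0.02),
}
# Column vector of the transformation to G, from Table 13.
PSI_G = (0.878, 0.866, 0.929, 0.491)
# Column G as printed in Table 11.
G_TABLE = {1: 0.68, 2: 0.63, 3: 0.73, 4: 0.46, 5: 0.52, 6: 0.62, 7: 0.84,
           8: 0.74, 9: 0.85, 10: 0.64, 11: 0.47, 12: 0.78, 13: 0.76,
           14: 0.27, 15: -0.49, 16: -0.31}

# Equation (16): each saturation on G is a weighted sum of a row of V.
r_jg = {j: sum(v * p for v, p in zip(row, PSI_G)) for j, row in V.items()}

for j in sorted(r_jg):
    print(j, round(r_jg[j], 2), G_TABLE[j])
```

Since the entries of V are rounded to two decimals, agreement with the printed column to within one unit in the second decimal is all that should be expected.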

The second-order general factor G can be interpreted in this ex- 
ample as a size factor and it also indicates that in generating the 
thirty-two figures the four parameters a, b, c, and h were not allowed 
to take entirely independent values. In other words, the extreme 
forms of figures either did not occur or else they were used only occa- 
sionally. If the four parameters had been allowed to take entirely in- 
dependent values, then there would have been an appreciable number 
of figures in which one of these parameters had an unusually small 
value while some other parameter had some unusually large value. 
This interpretation of the second-order general factor leads to a con- 
sideration of what we shall call generating parameters. The present 
geometrical example illustrates the type of factorial organization that 
is represented diagrammatically in Figure 2. The problem of inter- 
preting the four primary factors can be solved in this case without 
investigating the second-order domain. But if the correlations be- 
tween the primary factors show unexpectedly low rank, then this fact 
can be utilized factorially in gaining further insight into the condi- 
tions under which the objects were generated. The four primary fac- 
tors here identified by the simple structure were the four parameters 
that were used in setting up the problem. 
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7. Generating Parameters 

In addition to the principle of simple structure for the descrip- 
tion of each individual object, we may consider an extension of this 
principle to the problem of describing the manner in which the mea- 
sured objects were generated. Other things being equal, we should 
prefer a set of descriptive parameters that give some indication of 
the conditions that were operative in producing the objects. To the 
extent that a factor analysis can throw some light on the conditions 
that were responsible for producing the objects and their measurable 
characteristics in addition to the description of each individual ob- 
ject, both by some simplifying set of parameters representing causa- 
tive factors, the factorial methods become even more useful as tools 
in scientific work. 

The numerical values of the trapezoid parameters in Table 6 de- 
fined thirty-two figures of various shapes. The method of construct- 
ing the table of four measures for each figure determined whether 
one or more second-order factors would be present and also whether 
each of the primaries would be equally or differently represented in 
the second-order factor. The factorial result could be altered in- 
definitely by the manner in which the objects were generated in con- 
structing Table 6. Since it is the object of factor analysis to reveal 
the underlying order in the domain, it is an essential part of the nu- 
merical example to show that there is a relation between the generat- 
ing principles and the factorial results. 

The first column of the table contains the three linear measurements
1, 2, and 3. Suppose that these were inserted in the column entirely
at random. Assume that each column was similarly constructed
by distributing a set of measurements entirely at random. Then we
should expect zero correlations between the four primaries $T_a$, $T_b$,
$T_c$, and $T_h$. The correlation matrix for the four primary factors
would be an identity matrix and it would not be factored because the
primary factors would be statistically independent. There would be
no second-order factor present.

If, for each one of these thirty-two figures with uncorrelated pri- 
maries, we should draw another one similar in proportions but with 
twice the area and another one with similar proportions but three 
times the area, then we should have a set of ninety-six figures con- 
sisting of three sets that have similar shapes but different sizes. If 
this new set of ninety-six figures were analyzed factorially with the 
same battery of sixteen measurements, we should find the same pri- 
mary factors but they would be correlated. Furthermore, the corre- 
lations of the primary factors would all be the same, so that we should 
have a correlation matrix for the primary factors with uniform side 
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correlations. The reduced correlation matrix would have unit rank 
and all of the four primaries would have the same saturation on the 
second-order general factor. This would be a situation with a second- 
order general factor which has a uniform effect on all of the pri- 
maries. Here again, the factorial result would be determined by the 
manner in which the objects were generated. 
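
The effect of enlarging each figure can be demonstrated on a small artificial population. The sketch below is an illustration under stated assumptions and not a reanalysis of Table 6: a complete design of 81 figures, in which each of a, b, c, and h takes the values 1, 2, and 3 in every combination, stands in for the thirty-two figures with uncorrelated parameters, and each figure is then redrawn with two and three times the area. Only the correlations among the four parameters are computed; no factoring is attempted.

```python
import itertools
import math


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def side_correlations(figures):
    """Correlations between every pair of the four parameters a, b, c, h."""
    columns = list(zip(*figures))
    return [pearson(columns[i], columns[j])
            for i, j in itertools.combinations(range(4), 2)]


# Assumption: a complete 3 x 3 x 3 x 3 design gives exactly uncorrelated columns.
base = list(itertools.product([1, 2, 3], repeat=4))

# Doubling or tripling the area multiplies every length by sqrt(2) or sqrt(3).
scales = [1.0, math.sqrt(2.0), math.sqrt(3.0)]
enlarged = [tuple(s * v for v in fig) for fig in base for s in scales]

r_base = side_correlations(base)
r_enlarged = side_correlations(enlarged)
print([round(r, 3) for r in r_base])
print([round(r, 3) for r in r_enlarged])
```

In the base population all six side correlations vanish. In the enlarged population they are all positive and equal, which is the uniform unit-rank pattern described above.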

Suppose that a group of persons were asked to draw some trape- 
zoids of arbitrary shapes and that these trapezoids were assembled 
as a population of figures to be measured and analyzed factorially. 
Then we should almost certainly introduce a second-order size factor 
because our subjects would probably unwittingly draw the figures so 
that the several dimensions of each figure would be at least roughly 
of the same general order of magnitude. Some of the subjects might 
draw trapezoids of the general size of, say, five or six inches while 
other subjects might draw figures only one or two inches across. Very 
few would produce trapezoids that are one or two inches wide and ten 
inches tall. In other words, since some subjects would draw big fig- 
ures and others small ones and since they would probably produce 
very few extreme figures, there would be strong correlation between 
the primary factors and these in turn could be analyzed factorially 
into secondary factors. In this situation the rank of the correlations 
of the primary factors would probably not be exactly one but the in- 
ference could certainly be drawn from the factorial result that sec- 
ondary factors were operative to produce some big figures and some 
small ones in addition to the primary parameters that define the in- 
dividual figures. 

The interpretation of the second-order factor as a size factor in 
the trapezoid example should be distinguished from the size factor 
that could be chosen as a parameter in the first-order domain. If one 
of the measurements had been the total area of the trapezoid, it would 
have been represented by a test vector in the middle of the configura- 
tion since it would be affected by all four of the generating parameters 
that were used and which appeared in the simple structure. The to- 
tal area test vector could be normalized to a unit vector and it could 
be used as one of the parameters for describing the trapezoids. It 
would not be identical with the second-order size factor but they would 
be closely related. Whether a size factor appears as a first-order fac- 
tor or as a second-order factor depends on the restrictive conditions 
under which the figures or objects are produced or selected and also 
on the selection of measurements for the test battery. It is interest- 
ing to note that here the results would indicate either that the thirty- 
two trapezoids had been systematically selected by some restrictive 
conditions or else that the objects themselves had been generated un- 
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der some restrictive conditions. 

When the factorial results are clear in both the first-order and 
second-order domains, inferences can sometimes be drawn concern- 
ing the generating conditions that produced the individual parts of the 
objects. Such inferences can be the basis for formulating hypotheses 
that can be investigated further either by factorial methods or by 
more directly controlled experiments. 


8. Incidental Parameters 

So far we have considered the primary factors determined by a 
simple structure as representing parameters that can be given some 
scientific interpretation in terms of concepts that are fundamental 
for the domain in question. In using the simple structure solution 
which leads sometimes to the second-order domain, we have tried to 
avoid using arbitrary parameters whose only merit is that they serve 
in the condensation of numerical data. We have tried to find in the 
primary factors a set of parameters that not only describe the indi- 
vidual measurements but which also reveal something about the un- 
derlying order in the domain. In looking for meaningful parameters 
of this kind it would be an error to assume that all of the factors have 
significance that transcends the particular experiment or the par- 
ticular group of subjects. It would be strange indeed if factor analy- 
sis were immune from the distracting circumstances of the particular 
occasion. The experimenter must try to distinguish that which is in- 
variant and which transcends the particular experimental arrange- 
ment or the particular experimental group of subjects from that which 
is local and incidental to the particular occasion. In factor analysis 
we are not relieved of this difficult task any more than in other forms 
of scientific experimentation. In order to focalize attention to this 
circumstance it might be well to distinguish the primary factors 
which represent the invariants for which we are really looking from 
those primary factors which, though genuine as regards the explana- 
tion of the test variances, are local and of significance only for the 
experimental group or the particular occasion. Primary factors which 
characterize only a particular experimental group or a particular sit- 
uation may be called incidental factors to distinguish them from the 
invariants which are normally the object of scientific experimenta- 
tion. Incidental factors may appear in the first-order or in the sec- 
ond-order domain. 

A few examples will serve to illustrate the manner in which inci- 
dental factors may appear as primaries in factorial analysis. In addi- 
tion to the primary factors that would be found in different groups of 
subjects, we might find primary factors that are unique for the par- 
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ticular occasion. Suppose that an exceptionally good examiner who 
is skilled in obtaining good rapport with the subjects should give 
a part of the test battery to a part of the experimental group. A 
primary factor might appear for this group of tests and the investi- 
gator might be at a loss to explain it because he would be thinking 
about the nature of the tests and he would try to find something com- 
mon in the psychological nature of these tests. It might not occur to 
him that this is the very group of tests that were administered by 
the experienced examiner. Such a factor would probably be left with- 
out interpretation in the final results or the interpretation might be 
one that would not be sustained in a subsequent experiment with dif- 
ferent subjects and different examiners. Incidental factors are al- 
most certainly present in every study. Hence the investigator should 
feel free to leave without interpretation those primary factors which
do not lend themselves to rather clear scientific interpretation. Even 
then the interpretation should be at first in the nature of a hypothe- 
sis to be sustained if possible by subsequent factorial studies. The 
fact that all of the variances are not adequately accounted for in the 
interpretation has led some students to conclude that the whole result 
should be discarded, but such is not the case. It is quite possible to 
make an important discovery concerning the primary factors that are 
operative in an experiment even though the major part of the com- 
mon factor variances remains unexplained. It is assumed, of course, 
that such a finding could be sustained by the construction of new 
tests with prediction as to how they should behave factorially in new 
groups of differently selected subjects. 

In one factorial study it was found that a primary factor was 
common to a set of tests that were given by the projector method 
with individual timing for each response. The interpretation of such
a factor was uncertain. Some psychological function might be in- 
volved in the projector tests which was absent from the other tests, 
but the explanation might also be that some motivational condition 
was common to the projector tests that was absent from the other 
tests and which would be of only incidental significance as far as the 
major purposes were concerned. 

Suppose that one of the examiners misunderstands the time limits 
for a set of tests and that he gives the shorter time limits to a part 
of the group of subjects for some of the tests. A factor might ap- 
pear under certain circumstances that would be incidental and of no 
fundamental significance, but the primary factors that are signifi- 
cant might still be revealed. An unexpected interruption in a school 
examination such as fire drill, a street parade, or the expectancy of 
an important school event may act to introduce incidental factors. 
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One of the most important sources of incidental factors is to be 
found in the selective conditions. If a group of subjects is selected 
because of qualification in a composite of two or more tests, the 
unique variances of such selective tests combine to form one or more 
incidental common factors which would have remained a part of the 
unique variance if the selective conditions had not been imposed. The 
correlations between the factors are determined in large part by the 
selective conditions. If a group of subjects is selected because of cer- 
tain test qualifications, it is to be expected that the primary factors 
will show correlations between factors that are different from the 
correlations between the same factors in an unselected population. 
It must not be assumed that the factors are different just because 
they correlate differently in different populations. This effect is well 
known with physical measurements, height and weight with intelli- 
gence, for example, whose intercorrelations are determined in large 
part by the selective conditions. These changes do not affect the iden- 
tity of the factors. An incidental factor which is introduced by con- 
ditions of selection may be trivial or it may be of significance, de-
pending on the nature of the unique variances which are introduced 
into the common factors by the selective conditions. 
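
The way in which selection on a composite alters correlations can be shown with a deliberately simple case. The sketch below is a simplified illustration and not a factor analysis: two scores that are exactly uncorrelated in the whole population become negatively correlated once subjects are admitted only when the sum of the two scores reaches a cutoff. The score range and the cutoff of 12 are hypothetical.

```python
import math


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Unselected population: every pairing of scores 0..9 occurs exactly once,
# so the two scores are uncorrelated.
population = [(x, y) for x in range(10) for y in range(10)]

# Selective condition (hypothetical): admit only composites of 12 or more.
selected = [(x, y) for x, y in population if x + y >= 12]

r_population = pearson(*zip(*population))
r_selected = pearson(*zip(*selected))
print(len(selected), round(r_population, 3), round(r_selected, 3))
```

The scores themselves are unchanged by the selection; only their correlation in the selected group differs from that in the unselected population.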

It should be remarked that in a well planned factorial experi- 
ment the incidental factors are usually of secondary importance in 
comparison with the variance that is assignable to the principal pri- 
mary factors for which an experiment was planned. When one or 
more primary factors have relatively small variance and do not seem 
to lend themselves to clear interpretation, they should be reported 
without interpretation. Some reader of such a report may find a fruit- 
ful hypothesis for it, or the factor may be of only incidental signifi- 
cance. 

These few examples will serve to call attention to the fact that 
not all the primary factors can be expected to have meaning in the 
fundamental sense of representing functional unities whose identity 
transcends the particular group of subjects and the experimental 
conditions of any particular occasion. It does not follow that inci- 
dental factors are in any sense artifacts. They may represent gen- 
uine factors that were operating to produce the observed individual 
differences but their significance may not extend beyond the particu- 
lar occasion. In that sense they are irrelevant to the purposes of the 
experimenter even though they are valid as factors which can some- 
times be identified. 

An interesting application of second-order factors is an attempt 
to reconcile three theories of intelligence, namely, Spearman’s theory 
of a general intellective factor, Godfrey Thomson's sampling theory
with what he calls "sub-pools," and our own theory of correlated
multiple factors which are interpreted as distinguishable cognitive
functions. The tetrad differences vanish when there are no primary
factors common to the four tests of each tetrad, the correlations being
determined only by the general second-order factors. This application
of second-order factor theory will be the subject of a subsequent
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TABLE 4 


Case 1. Transformation from F to V including the column vector G
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TABLE 5 
Case 3. Transformation from V to U including the group factor and the 
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TABLE 6 
Trapezoid Parameters 
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TABLE 7 
Correlation Matrix

        1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16
 1   1.00   .50   .50   .32   .29   .58   .72   .49   .58   .45   .31   .66   .53   .76  -.35   .11
 2    .50  1.00   .50   .32   .36   .42   .57   .49   .74   .67   .33   .54   .64  -.16  -.14  -.23
 3    .50   .50  1.00   .32   .52   .42   .88   .82   .90   .45   .30   .78   .75   .19  -.84  -.72
 4    .32   .32   .32  1.00   .95   .96   .65   .80   .61   .91   .98   .78   .82   .12  -.22  -.15
 5    .29   .36   .52   .95  1.00   .90   .75   .90   .75   .90   .94   .84   .89   .05  -.37  -.31
 6    .58   .42   .42   .96   .90  1.00   .78   .83   .70   .92   .94   .86   .86   .34  -.29  -.09
 7    .72   .57   .88   .65   .75   .78  1.00   .95   .95   .74   .64   .95   .91   .39  -.69  -.46
 8    .49   .49   .82   .80   .90   .83   .95  1.00   .93   .83   .78   .94   .95   .19  -.64  -.52
 9    .58   .74   .90   .61   .75   .70   .95   .93  1.00   .79   .60   .90   .93   .11  -.64  -.57
10    .45   .67   .45   .91   .90   .92   .74   .83   .79  1.00   .90   .83   .90   .01  -.22  -.21
11    .31   .33   .30   .98   .94   .94   .64   .78   .60   .90  1.00   .77   .80   .11  -.12  -.09
12    .66   .54   .78   .78   .84   .86   .95   .94   .90   .83   .77  1.00   .97   .34  -.59  -.39
13    .53   .64   .75   .82   .89   .86   .91   .95   .93   .90   .80   .97  1.00   .12  -.52  -.44
14    .76  -.16   .19   .12   .05   .34   .39   .19   .11   .01   .11   .34   .12  1.00  -.28   .34
15   -.35  -.14  -.84  -.22  -.37  -.29  -.69  -.64  -.64  -.22  -.12  -.59  -.52  -.28  1.00   .76
16    .11  -.23  -.72  -.15  -.31  -.09  -.46  -.52  -.57  -.21  -.09  -.39  -.44   .34   .76  1.00
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TABLE 8 TABLE 9 
Orthogonal Factor Matrix F (Table 8)

         I     II    III    IV
 1     ...   ...   ...   ...
 2     .69  -.08  -.01   .59
 3     .19  -.46   .38   .08
 4     .81   ...  -.42  -.25
 5     .88   .18  -.38  -.24
 6     ...   ...   ...   ...
 7     ...  -.02   .20   .00
 8     ...   ...   ...   ...
 9     ...   ...   ...   ...
10     .88   .25  -.36   .17
11     ...   ...   ...   ...
12     .97   .09   .10  -.05
13     .98   .01  -.09   .06
14     .21   .47   .65  -.38
15    -.60   .50  -.44   .07
16    -.46   ...   .02   .08

Distribution of Fourth-Factor Residuals (Table 9)

Residual   Frequency
  .00          50
  .01          94
  .02          46
  .03          30
  .04           8
  .05           2
  .06           2
  .07           0
  .08           4
  .09           2
  .10           2
            N = 240
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TABLE 10 TABLE 11 
Transformation Matrix $\Lambda$ (Table 10)

         A     B     C     H
I       .07   .12   .39   .53
II      .70  -.01  -.81   .35
III     .71  -.32   .28  -.64
IV     -.01   .94  -.34  -.44

Oblique Factor Matrix V (Table 11)

         A     B     C     H     G
 1      .79   .01  -.01  -.02   .68
 2      .01   .63   .05   .05   .63
 3      .00   .01   .78   .00   .73
 4      .01  -.01   .00   .93   .46
 5     -.11   .00   .21   .86   .52
 6      .28   .03  -.03   .76   .62
 7      .27   .02   .47   .31   .84
 8      .01   .03   .47   .55   .74
 9      .00   .34   .46   .25   .85
10     -.02   .38  -.02   .71   .64
11      .02   .10  -.09   .91   .47
12      .20   .04   .35   .50   .78
13      .01   .20   .33   .55   .76
14      .81  -.54   .01   .03   .27
15     -.01   .41  -.89  -.02  -.49
16      .52   .01  -.82  -.02  -.31

TABLE 12

Matrix $C = \Lambda'\Lambda$

         A     B     C     H
A      1.00  -.24  -.34  -.17
B      -.24  1.00  -.35  -.15
C      -.34  -.35  1.00  -.11
H      -.17  -.15  -.11  1.00

Matrix D (diagonal)

          A      B      C      H
$T_a$    .808
$T_b$           .808
$T_c$                  .786
$T_h$                         .917

Matrix $C^{-1}$

         A     B     C     H
A      1.53   .73   .83   .46
B       .73  1.53   .84   .44
C       .83   .84  1.62   .45
H       .46   .44   .45  1.19

Matrix $D^{-1}$ (diagonal)

       $T_a$  $T_b$  $T_c$  $T_h$
A      1.237
B             1.237
C                    1.273
H                           1.091
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TABLE 13 

Correlation Matrix $R_t = D\,C^{-1} D$, with Column Vector $r_{tg}$

        $T_a$  $T_b$  $T_c$  $T_h$     $r_{tg}$
$T_a$   1.00    .48    .53    .34       .71
$T_b$    .48   1.00    .53    .33       .70
$T_c$    .53    .53   1.00    .32       .73
$T_h$    .34    .33    .32   1.00       .45

Residuals $= R_t - r_{tg}\, r'_{tg}$

        $T_a$  $T_b$  $T_c$  $T_h$
$T_a$    .50   -.02    .01    .02
$T_b$   -.02    .51    .02    .01
$T_c$    .01    .02    .47   -.01
$T_h$    .02    .01   -.01    .80

Column Vector $\Psi_{pg} = D^{-1}_{tp}\, r_{tg}$

         G
A      .878
B      .866
C      .929
H      .491
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TECHNIQUE FOR WEIGHTING OF CHOICES AND ITEMS 
ON I.B.M. SCORING MACHINES 


SERGEANT DAVID GROSSMAN 


PSYCHOLOGICAL RESEARCH UNIT NO. 3 
SANTA ANA ARMY AIR BASE 
SANTA ANA, CALIFORNIA 


A technique has been developed which permits the weighting of 
responses of test items on the I. B. M. scoring machine on the initial 
scoring, heretofore impossible. This is done by making the length 
of the response lines on the answer sheet longer or shorter as weights 
are needed. It is anticipated that this method will prove useful
where differential weighting serves to increase the validity of
tests.


Introduction 

In the past, weighting of answers to items on a questionnaire or 
test by scoring machines has been very difficult if not prohibitive. In
a single run through the I.B.M. scoring machine, weights of —1, 0, 
and + 1 only can be used. The addition of merely one more weight re- 
quires an extra run through the machine. In two runs a range of nine 
weights can be obtained, while a range of nineteen weights can be 
obtained in three runs. The technique here permits the weighting of 
responses to test items without requiring additional runs through the 
machine at least for the same marks on the answer sheet. In order to 
understand this procedure, a brief explanation of the I.B.M. scoring 
machine is needed. 


I. B. M. Procedures 

To obtain any score on the machine, a separate answer sheet is 
provided on which the testee fills in his choices with a graphite lead 
pencil. Each marked-in space on the answer sheet completes a cir- 
cuit across one set of contact plates on the sensing unit. This current 
goes through a set of resistors to a key-set-unit. There is a pin on 
the key-set unit corresponding to each set of contact plates on the 
sensing unit. 

When a blank stencil prepared for scoring is inserted in the 
front leaf of the stencil holder and pressed against the key-set-unit 
by means of an adjustment of the frame, the pins are pressed all the 
way in. When a graphite mark occurs on the answer sheet to com- 
plete the circuit, the dial will register a minus weight. When a stencil 
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is punched, it allows the pins to go all the way through. If a mark 
occurs on the answer sheet in this position, the weights will be plus. 
If the pin is pushed only halfway in, the item is entirely eliminated. 
This is done by putting a stencil in the front leaf of the holder with 
the desired eliminated items and the plus items punched. In the back
leaf of the holder, a stencil is inserted with only plus items punched.
This allows the pins of the eliminated items to go only halfway, 
while the pins of the positively weighted items go completely through. 


Weighting 

In weighting responses to tests of specific knowledge (physics, 
mathematics, or any factual material), the validity of each item must 
be established before proper weights can be assigned. It is necessary 
that each answer space be printed uniformly and that the subject’s 
response line, regardless of the weight assigned by the test construc- 
tionist, fill the complete space. This is imperative in order that the 
testee be unaware that his responses are weighted. The weighting 
is effected by the intricate manipulation of the key-set-unit. 

This technique of weighting can be advantageously employed in
perceptual and judgment fields, where degrees of correctness are of
importance. Here, too, it is necessary that answer spaces be uniform 
and that each response cover the complete space. 

On questionnaires and on tests of personality or emotion, in which 
the subject’s intensity of response is desired, the procedure is re- 
versed and the subject assigns his own weights by the length of his 
pencil mark. Regardless of the range of weights used, each question 
should have response spaces of uniform length. 


Weighting the Individual Items of a Test with 
Specific Weights for Each Item 

Presuppose a test where the validity has been established for each 
response and a range of five weights has been decided on. An answer 
sheet such as that shown in Figure 1, A, will be used. On this answer 
sheet, the response line is twice the length of the presently used form. 
Thus, it will be seen that the testee must mark a line for each item 
that is long enough to extend over the space occupied by two sets of 
contact plates. Let us assume that the assigned weights for question 
one are plus two for response “A,” plus one for “B,” zero for “C,” 
minus one for “D,” and minus two for “E.” Following the principles 
explained in preceding paragraphs, it is necessary to punch out on the 
stencil to go into the front leaf all those choices which are weighted 
plus and those eliminated as shown in Figure 2, A. On the example 
used, A, B, and C will be punched as shown. Since choice D is weighted
only minus one (half the possible minus weight), it is necessary to
eliminate the other potential half. On the stencil that is to be placed 
in the back leaf of the stencil holder, the final plus weights are deter- 
mined (as shown in Figure 2, B). Since A is weighted plus two, these 
two holes are punched. Since B is weighted plus one, only one hole is 
punched for it. When the stencil holder is pressed against the pins, 
the key-set-unit will be modified as shown in Figure 3. 
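
The punching rule in this example can be written out for any weight
within the range. The following Python sketch is an illustration under
stated assumptions: the helper name `stencil_plan` is hypothetical,
each response is taken to cover `plate_sets` sets of contact plates,
and each set contributes +1, 0, or -1 as described earlier.

```python
def stencil_plan(weight: int, plate_sets: int) -> dict:
    """Number of holes to punch for one response space (hypothetical helper)."""
    if abs(weight) > plate_sets:
        raise ValueError("weight exceeds the range the answer space allows")
    plus = max(weight, 0)                   # pins that go all the way through
    minus = max(-weight, 0)                 # pins left pressed all the way in
    eliminated = plate_sets - plus - minus  # pins stopped halfway
    return {"front": plus + eliminated, "back": plus, "unpunched": minus}


# Weights of the worked example: five weights, two plate sets per response.
example = {"A": 2, "B": 1, "C": 0, "D": -1, "E": -2}
for choice, weight in example.items():
    plan = stencil_plan(weight, plate_sets=2)
    # The net score is the number of plus contacts minus the minus contacts.
    assert plan["back"] - plan["unpunched"] == weight
    print(choice, plan)
```

Under these assumptions A, B, and C each receive two holes in the front
stencil and D receives one, while the back stencil carries two holes
for A and one for B, in agreement with the description above.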


Answer Sheets for Multiple-Choice Tests 

Most aptitude, achievement, and perceptual tests that are elec- 
trically scored are of the multiple-choice type. Therefore, each re- 
sponse space must be uniform. 

It is possible to employ currently used I.B.M. answer sheets, but 
this would require mimeographing over the answer sheets so that the 
length of the space gives the appropriate maximum plus or minus 
weights. Because of the exactness required for each answer sheet, 
however, this method is not recommended. It is far more desirable 
to have special forms printed for each range of weights. It would also 
be desirable to print the spaces on both sides so that more questions 
may be answered on one answer sheet. 

If a range of five weights (-2, -1, 0, +1, +2) is desired, an
answer sheet with spaces printed to cover two sets of contact plates 
must be printed (Figure 1, A). With a five-choice item, a possible 
total of seventy questions may be answered on one side. When a range 
of seven weights (-3, -2, -1, 0, +1, +2, +3) is needed, the answer
sheet must be printed so that each space covers three sets of contact 
plates (as shown in Figure 1, B). If this test has five-choice items, 
a total of fifty questions on one side of the answer sheet is possible. 

For all practical purposes, to extend the weights beyond seven, 
it is necessary to reduce the number of choices on any one question. 
It is hardly possible to expect the subject to mark a three-inch line 
both clearly and darkly enough to be picked up entirely by the ma- 
chine. 

Since a given horizontal row contains only fifteen sets of contact 
plates and a vertical row only five sets, an item with a weight range 
of nine or eleven can have only three choices. A question with a 
range of twelve weights or above is limited to two choices. Regard- 
less of the particular weight used, the printed answer spaces must be 
constant, and the subject must fill each in completely. Thus, in the 
case of some weights, the full fifteen horizontal spaces cannot be 
utilized. 
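
The limits on the number of choices follow from simple arithmetic on
the fifteen sets of contact plates in a horizontal row. The Python
sketch below is a rough illustration only. It assumes a weight range
that is symmetric about zero, so that a range of n weights needs an
answer space covering the ceiling of (n - 1)/2 plate sets, and it
assumes that all choices of an item lie in one horizontal row; the
function names are invented for this example.

```python
import math

PLATE_SETS_PER_ROW = 15  # sets of contact plates in one horizontal row


def plate_sets_per_choice(weight_range: int) -> int:
    """Plate sets one answer space must cover for a symmetric weight range."""
    return math.ceil((weight_range - 1) / 2)


def max_choices(weight_range: int) -> int:
    """Upper bound on the number of choices that fit in one horizontal row."""
    return PLATE_SETS_PER_ROW // plate_sets_per_choice(weight_range)


for weight_range in (5, 7, 9, 11, 13):
    print(weight_range, plate_sets_per_choice(weight_range), max_choices(weight_range))
```

The bound reproduces the figures in the text for ranges of seven, nine,
eleven, and twelve or more weights. For five weights it allows more
choices than the five-choice items discussed above, so it should be
read as an upper limit rather than a recommended layout.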

On tests where intensity of response is the important factor and 
the subject assigns his own weight, uniformity of the answer space
for each item is desirable. However, the subject must be impressed
with the fact that he is doing his own weighting by means of the 
number of answer spaces he covers with his graphite pencil. 


Conclusion 
Weighted scoring of tests by hand, as well as the old method of 
machine weighting, is not practical, especially when tests are given 
to large groups at frequent intervals. It is anticipated that the
method described in this report will prove very useful wherever
differential weighting serves to increase the validity of tests.


FIGURE 1: Answer Sheets

A. Five Weights. In the new form the spaces for item one are the same
as those covered by items one and two on the standard answer sheet
(No. 838).

B. Seven Weights. Answer sheet similar to the fifteen-choice answer
sheet (No. 881). Choice A covers the same space as did choices A, B,
and C on form No. 881. Choice B covers D, E, and F on the old form, etc.


FIGURE 2: Stencil, First and Second


FIGURE 3: Position of Pins When Pushed In

The letters in front of the pins represent the pins necessary for each
choice and what position each must be in to give the desired weights in
the example.
(1) Pin; (2) Plus contact; (3) No contact; (4) Minus contact;
(5) Front stencil; (6) Back stencil.


FACTORIAL DESIGN IN THE DETERMINATION OF
DIFFERENTIAL LIMEN VALUES*


PALMER O. JOHNSON AND FEI TSAO 
UNIVERSITY OF MINNESOTA 


This paper discusses the application of the principles of fac- 
torial design to an experiment in psychology. For the purpose of 
illustrating the principles, a simple experiment was designed dealing 
with the determination of the differential limen values of subjects 
for weights increasing at constant rates. The factorial design was 
of the type: 4 rates x 7 weights x 2 sexes x 2 sights x 2 dates. 
The appropriate statistical analysis for this type of design is the 
analysis of variance. The mathematical formulation of the problem 
was specified and the appropriate solution for the specific problem 
was obtained. Greater precision results from this type of design, in 
comparison with the traditional psychological experiment dealing 
with a single factor, in that measures are obtained of the effect of 
each of a number of factors together with their interactions. 


Mathematicians have, at least since the time of Euler, been inter- 
ested in combinatorial problems dealing with the arrangement of a 
finite number of things in sets, or patterns, or configurations, satis- 
fying specified conditions. However, it remained for Fisher and his 
associates to show how combinatorial principles can be put to the 
greatest practical use in the designing of experiments. Beginning in 
1922, Fisher laid down principles of designing experiments which 
revolutionized the techniques of agricultural trials and have changed 
the course of the design of agricultural experiments throughout the 
world. In his book, The Design of Experiments,† now in its third
edition, 1942, Fisher has laid the framework of scientific inference 
and has developed the principles of experimentation which are find- 
ing increasing application in many fields of science. Replication, ran- 
domization, and local control, as fundamental requirements of a self- 
contained experiment, are found to be principles of general utility 
wherever the basic materials are variable. The difficulties met with 
in one field are not identical, but many are similar, to those in other 
fields. The solutions arrived at in one field are often of material as- 
sistance in another. In so far as fields differ fundamentally, new 
techniques are required and these can be developed only in direct con- 
tact with the obstacles themselves. 

* This is one of a number of studies on modern principles of experimental
design. For the research grant to finance these studies, grateful
acknowledgment is given to the Graduate School, the University of Minnesota.
† Fisher, R. A. The design of experiments. London: Oliver and Boyd, 1934.

The new types of design are based upon ideas which differ sharply
from earlier ones as to the number of enquiries to be included in
a single experiment. The modern experimenter usually combines sev- 
eral single lines of enquiry in the same experiment. The traditional ex- 
perimenter believes that an experiment must be simple and stresses the 
importance “of varying the essential conditions only one at a time.” 
Among the experimental designs that have been so far developed, one 
in which several different problems are included in the same experi- 
ment is the factorial design. In this design, all the factors to be ex- 
amined are varied concurrently in all possible combinations. The 
chief advantages of this type of design over the traditional experiment
designed to examine a single question, or a single factor, reside
in its greater efficiency and comprehensiveness. These advantages
are attained through the fact that in a factorial experiment
every trial contributes to the answering of every question with the
same precision as though the whole experiment had been given over
to any one of them and that, in addition to measuring the effects of
each of the single factors, the measures of the effects of the
interactions of all combinations of factors are obtained with the same
precision. The latter advantage is especially great since with separate
single-factor experiments no measure at all is attainable of the inter- 
action of the different factors. A third distinct advantage of fac- 
torial design lies in the fact that this plan gives results of wider ap- 
plicability than do single experiments, since the exact standardiza- 
tion of experimental conditions prescribed for the traditional experi- 
mental design provides information only in respect to a narrowly re- 
stricted set of conditions. In the factorial design, the ingredients may 
be varied, i.e., applied at different levels, while in the single factor 
experiments standardization requires that the other factors be kept 
constant. Rarely does this standardization of conditions go beyond 
that attained by a more or less arbitrary definition. 
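
The statement that every trial contributes to every question can be
made concrete by enumerating the treatment combinations. The Python
sketch below is a minimal illustration that uses the factor levels of
the experiment reported in this paper; the variable names are
assumptions made for the example, and no observed data are involved.

```python
from collections import Counter
from itertools import product

levels = {
    "rate": [50, 100, 150, 200],                     # cc per 30 seconds
    "weight": [100, 150, 200, 250, 300, 350, 400],   # grams
    "sex": ["male", "female"],
    "sight": ["normal", "blind"],
    "date": [1, 2],
}

# Every factor is varied concurrently in all possible combinations.
cells = [dict(zip(levels, combo)) for combo in product(*levels.values())]

# For each factor, count how many cells fall at each of its levels.
cells_per_level = {
    factor: Counter(cell[factor] for cell in cells) for factor in levels
}

print(len(cells))
for factor, counts in cells_per_level.items():
    print(factor, dict(counts))
```

Each level of each factor is represented in the same number of cells,
so the comparison among the levels of any one factor draws on all of
the cells, not on a separate subset of the trials.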

The factorial design promises to be of especial value in the design 
of experiments in psychology where standardization of conditions is 
difficult, if not often impossible, and where the measure of interact- 
ing factors may be of even greater significance than of the direct ef- 
fect. It was to gain at first hand an insight into the nature of the 
modern principles of design and to explore their possibilities that the 
investigation reported here was carried out. It is futile to dogmatize 
about the possibilities of the newer ideas. What they can do can only 
be discovered by trial. The report is divided into two parts. In Part 
I we attempt to exhibit the principles of factorial design in use by 
presenting the design and the analysis of the results of an experiment
in psychology. The existing literature on factorial designs is
concerned rather with planning more or less complex experiments
and with the arithmetical procedures in treating the results than with
the mathematical theory underlying the method.* Part II, which may
be omitted by the non-mathematical reader, is devoted to the
development of the mathematical foundation fundamental to the
appropriate statistical analyses. Command of the procedure to be followed
in attacking problems of this kind should enable the statistician to
develop and analyze designs for attacking new problems.


PART I 

The psychological experiment to which the method of factorial 
design was applied consisted in determining the differential limen 
of subjects for weights increasing at constant rates. An apparatus
was set up by which it was possible to increase continuously certain
standard weights at a specified rate. An aluminum pail was attached 
by a hook to a lever mounted on a tripod and the system placed in 
equilibrium by an adjustable screw controlling a movable weight. At 
one end of the lever arm there was a circular ring into which the sub- 
ject placed his index finger, by which he lifted the pail containing a 
weight to which additions were being made at a constant rate of in- 
crease. A graduate of 1000 cc capacity was mounted in a frame and 
filled with water. Four glass tubes of different bores calibrated to 
four different rates were fitted into the bottom of the graduate. These 
glass tubes were connected with rubber tubes entering into the bot- 
tom of the aluminum pail through a valve that controlled the flow of 
water to prevent the impact and sound of flowing water from distract- 
ing the subject. Stop-cocks regulated the flow of water into the pail. 
The four rates of flow were established as 50, 100, 150, and 200 cc 
per 30 seconds. Seven different standard weights were used — 100, 
150, 200, 250, 300, 350, and 400 grams. Each standard weight was 
combined with each of the standard increasing rates so that there 
were in all 28 weight-rate combinations. 

The experimental subjects were four men and four women. Two 
of each sex were normally sighted; the other two were congenitally 
blind. The two normally sighted males were 21 and 25 years of age. 
The two congenitally blind males were of the same respective ages. 
The two normally sighted females were each 33 years old, the same 
age as each of the congenitally blind females. 


* Two very informative publications on applications of factorial design, es- 
pecially the second in relation to the problem discussed here, are: 

1. Yates, F. Imperial Bureau of Soil Science. Technical Communication
No. 35. Imperial Bureau of Soil Science: Harpenden, England, 1937. 95 pp.

2. Mahalanobis, P. C., and Nair, K. R. Sankhya, The Indian Journal of 
Statistics, 1941, 5, 285-94 [Statistical Society, Calcutta]. 








110 PSYCHOMETRIKA 


Two preliminary trials were run previous to the main experi- 
ment to get the subjects accustomed to the apparatus and experi- 
mental technique. The observer O is standing and if normally sighted 
is blindfolded. The lever arm is placed in the zero position with the 
aluminum pail in balance. O’s index finger is placed in the circular 
slot at the end of the lever arm (either his right or left index finger, 
dependent upon his handedness.) The lever is released and O is asked 
to raise and lower the lever arm from two to three inches and to 
sense the heaviness. A 50-gram weight is then placed in the pail, and 
O is asked to sense the heaviness, indicating readiness by saying “ready.” The
rubber tube with rate 200 cc per 30 sec. is opened and the water al- 
lowed to flow into the pail. O indicates by saying “now” when he 
senses the just noticeable heavierness in the lifted lever arm. At this 
instant the stop-cock shuts off the flow of water. The experimenter, 
E, pours off the water from the pail into a graduate and measures 
to the nearest 0.1 cc. The number of cc of water is recorded as the 
difference limen value in grams. The magnitude of stimulus that 
corresponds to a least sense interval from some positive point on the 
sense scale is defined as the difference limen. The 50-gram weight is 
then removed and combined weights of 450 grams are placed in the 
pail. The graduate is filled to the 1000 cc mark. The rubber tube 
with the 50 cc rate per 30 seconds is then opened and water allowed 
to flow into the pail. O again indicates the moment of just noticeable 
heavierness. 

The main experiment was then carried out. Five differential
limen values (D.L.) were determined for each O on each of the 28
rate-weight combinations. The order of presentation of each
combination was established in advance by the use of Fisher and Yates’
set of random sampling numbers. Catch stimuli were also randomly
introduced to check the reality of O’s response. The entire experiment
was repeated on each subject after an interval of one week.
There were, therefore, 280 D.L. values for each O. The arithmetical
mean of the 5 observational values for each
individual-rate-weight-date combination was used in the analysis of
variance. The data conformed to a 4 x 7 x 2 x 2 x 2 factorial design,
that is, the combination of 4 rates, 7 weights, two sights, two sexes,
and two dates. The analysis of the experimental observations follows.


The Total System of Interactions 
We proceeded first to test the null hypotheses dealing with the 
interactions. The policy is rather often followed, on the basis of
findings in agricultural experiments, to assume that higher-order
interactions are not significant. It is recommended here that interactions
of all orders should be tested when the numerical measure of the
interaction can be obtained from the data. In our problem there were
ten measures of interactions of the first order, ten measures of
interactions of the second order, five measures of triple interactions,
and one of quadruple interaction. The tests of significance of the
twenty-six interactions are recorded in Table 1. From these tests it was
found that the following interactions were significant:


sex x sight x rate 

sex x sight 

sex x rate 

sight x rate 

sight x weight (doubtful). 
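
The count of twenty-six interactions and their degrees of freedom can
be checked by enumeration. The Python sketch below is an illustration
only; it assumes the usual rule that the degrees of freedom of an
interaction are the product of (levels - 1) over the factors involved,
and the names used are invented for the example.

```python
from itertools import combinations
from math import prod

LEVELS = {"sex": 2, "sight": 2, "weight": 7, "rate": 4, "date": 2}


def interaction_df(factors):
    """Degrees of freedom of an interaction: product of (levels - 1)."""
    return prod(LEVELS[f] - 1 for f in factors)


# All interactions among two, three, four, and five factors.
interactions = [c for k in range(2, 6) for c in combinations(LEVELS, k)]
by_size = {k: sum(1 for c in interactions if len(c) == k) for k in range(2, 6)}

main_df = sum(n - 1 for n in LEVELS.values())
interaction_df_total = sum(interaction_df(c) for c in interactions)
error_df = 447 - main_df - interaction_df_total  # 448 values give 447 d.f.

print(by_size, main_df, interaction_df_total, error_df)
```

The remaining degrees of freedom agree with the 224 shown for error in
Table 1, which corresponds to the two subjects in each of the 224
treatment combinations.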


The significant as well as the doubtful interactions were retained 
as specific components in the analysis of variance table. The
interactions shown to be statistically insignificant were included in
experimental error. We thus found that the 447 independent comparisons
among the 448 different differential limen values could be resolved
into the 11 components as shown in Table 2. The tests of significance
resulted in the following conclusions:


significant main effects: sex, sight, weight, and rate
significant second-order interactions: sex x sight x rate
significant first-order interactions: sex x sight

sex x rate

sight x weight

sight x rate
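
The pooling of the insignificant interactions with error can be
verified by direct addition. In the Python sketch below the degrees of
freedom and sums of squares are copied from the rows of Table 1 marked
"Accepted"; the variable names are assumptions made for this
illustration.

```python
# (degrees of freedom, sum of squares) copied from Table 1
error = (224, 34438)
insignificant = [
    (18, 270),                                                # five factors
    (18, 320), (6, 379), (3, 60), (18, 538), (18, 205),       # four factors
    (6, 1406), (1, 270), (18, 215), (6, 637), (3, 61),        # three factors
    (18, 654), (6, 340), (3, 14), (18, 527),
    (6, 405), (1, 9), (1, 4), (18, 391), (6, 488), (3, 61),   # two factors
]

residual_df = error[0] + sum(df for df, _ in insignificant)
residual_ss = error[1] + sum(ss for _, ss in insignificant)
residual_ms = residual_ss / residual_df

print(residual_df, residual_ss, round(residual_ms, 2))
```

The totals agree with the residual row of Table 2. The mean square
comes to about 99.5, and the F values in Table 2 appear to have been
computed with this figure rounded to 100.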


It is important to note that there was no significant difference 
between dates and that no interaction including date was significant. 
This demonstrates that the observations were consistent among them- 
selves. 


TABLE 1
Tests of Significance of Interactions

                                             Sum of     Mean              Test of
Source of Variation                  D.F.    Squares*   Square    F†      Hypothesis‡
Error                                 224    34,438       154      -      -
Sex x Sight x Weight x Rate x Date     18       270        15      -      Accepted
Sex x Sight x Weight x Rate            18       320        18      -      Accepted
Sex x Sight x Weight x Date             6       379        63      -      Accepted
Sex x Sight x Rate x Date               3        60        20      -      Accepted
Sex x Weight x Rate x Date             18       538        30      -      Accepted
Sight x Weight x Rate x Date           18       205        11      -      Accepted
Sex x Sight x Weight                    6     1,406       234     1.52    Accepted
Sex x Sight x Rate                      3     2,216       739     4.80    Rejected
Sex x Sight x Date                      1       270       270     1.75    Accepted
Sex x Weight x Rate                    18       215        12      -      Accepted
Sex x Weight x Date                     6       637       106      -      Accepted
Sex x Rate x Date                       3        61        20      -      Accepted
Sight x Weight x Rate                  18       654        36      -      Accepted
Sight x Weight x Date                   6       340        57      -      Accepted
Sight x Rate x Date                     3        14         5      -      Accepted
Weight x Rate x Date                   18       527        29      -      Accepted
Sex x Sight                             1    14,130    14,130    91.75    Rejected
Sex x Weight                            6       405        68      -      Accepted
Sex x Rate                              3     5,720     1,907    12.38    Rejected
Sex x Date                              1         9         9      -      Accepted
Sight x Weight                          6     2,089       348     2.26    Remain in Doubt
Sight x Rate                            3     2,083       694     4.51    Rejected
Sight x Date                            1         4         4      -      Accepted
Weight x Rate                          18       391        22      -      Accepted
Weight x Date                           6       488        81      -      Accepted
Rate x Date                             3        61        20      -      Accepted

* For the calculation of sum of squares, see Table 3 in Part II.

† Where F = (Mean square of a tested variation) / (Mean square of error
or residual). By referring to Snedecor’s tables of F (See Snedecor, G. W.,
Statistical methods applied to experiments in agriculture and biology.
Iowa: Collegiate Press, 1938, pp. 174-77), we may use the following four
rules in testing the hypothesis: (a) reject the hypothesis tested, if
the calculated value of F is greater than the 1% point given in the
tables; (b) accept the hypothesis tested, if the calculated value of F
is less than the 5% point given in the tables; (c) remain in doubt, if
the calculated value of F lies between the 5% and 1% points given in the
tables; (d) in the event that the mean square of error or residual is
greater than the mean square of a tested variation, then we simply
accept the hypothesis without calculating the value of F.
These conventions are always true throughout the following tables.

‡ The hypothesis tested is a null hypothesis concerning the variation in
the same row. For instance, the hypothesis regarding sex x sight x
weight x rate x date is that there is no significant interaction between
sex, sight, weight, rate, and date. To give another example, the
hypothesis regarding sex x sight is that there is no significant
interaction between sex and sight.


TABLE 2
Complete Analysis of Variance of “D.L.” Values

                               Sum of     Mean                Test of
Source of Variation    D.F.    Squares*   Square     F        Hypothesis†
Residual                419     41,692       100      -       -
Sex x Sight x Rate        3      2,216       739     7.39     Rejected
Sex x Sight               1     14,130    14,130   141.30     Rejected
Sex x Rate                3      5,720     1,907    19.07     Rejected
Sight x Weight            6      2,089       348     3.48     Rejected
Sight x Rate              3      2,083       694     6.94     Rejected
Sex                       1     31,534    31,534   315.34     Rejected
Sight                     1     11,810    11,810   118.10     Rejected
Weight                    6      1,909       318     3.18     Rejected
Rate                      3     51,072    17,024   170.24     Rejected
Date                      1        116       116     1.16     Accepted
Total                   447    164,371

* The sum of squares for residual is simply the sum of the sums of
squares for error and all insignificant interactions which are indicated
in Table 1. See Table 3 in Part II for the calculation of sums of
squares for all the other variations.

† The hypothesis tested is a null hypothesis regarding the variation in
the same row. For instance, the hypothesis regarding sex is that there
is no significant difference between sex-means.


The Main Effects of Weights
The main effect of weights was found to be significant (F = 3.18,
Table 2). This indicates that the mean D.L. value varied significantly
with the weight. The relation between the D.L. values and weights
can be expressed by the linear regression equation

    Y = 24.8554 - .983921x ,

where Y is the predicted D.L. value for a given value of the weight
and x = (W - 250)/50. For this determination it is convenient to use
orthogonal polynomials.* The linear coefficient was significant; both
the quadratic and cubic coefficients were insignificant. The analysis of
variance for testing goodness of fit is given in Table 3.
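
As an illustration of how the linear equation for weights is used, the
Python sketch below evaluates it at the seven standard weights. The
coefficients are those quoted above; the function names are
assumptions made for the example, and the printed values are
predictions from the fitted line, not observed means.

```python
STANDARD_WEIGHTS = [100, 150, 200, 250, 300, 350, 400]  # grams


def coded_weight(weight_g: float) -> float:
    """Coded value x = (W - 250) / 50 used in the regression equation."""
    return (weight_g - 250) / 50


def predicted_dl(weight_g: float) -> float:
    """Predicted D.L. value from the linear form Y = 24.8554 - .983921x."""
    return 24.8554 - 0.983921 * coded_weight(weight_g)


for w in STANDARD_WEIGHTS:
    print(w, coded_weight(w), round(predicted_dl(w), 4))
```

Because the coded weights are centered on 250 grams, the constant term
is the predicted value at the middle standard weight, and each step of
50 grams lowers the prediction by the same amount.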

* To obtain the equations in Tables 3, 4, 5, 8, 11, and 14 the reader is
referred to:

Goulden, C. H. Methods of statistical analysis. New York: John Wiley, 1939,
pp. 219-246.

Anderson, R. L., and Houseman, E. E. Tables of orthogonal polynomial values
extended to N = 104. Agricultural Experiment Station, Iowa State College of
Agriculture and Mechanic Arts, Research Bulletin 297, 1942, pp. 595-606.


TABLE 3
Components of Variation Due to Weight

                              Sum of     Mean               Test of
Source of Variation   D.F.    Squares    Square     F       Hypothesis*
Linear                  1      1,735     1,735    17.35     Rejected
Quadratic               1        113       113     1.13     Accepted
Cubic                   1         24        24      -       Accepted
Remainder               3         37        12      -       Accepted
Weights                 6      1,909

Regression Equations for the Prediction of
“D.L.” Values from Weights

Linear Form:      Y = 24.8554 - .983921x
Quadratic Form:   Y = 24.2761 - .983921x + .144829x²
Cubic Form:       Y = 24.2761 - .983921x + .144829x² - .042056x³

where x = (W - 250)/50.

* The hypothesis tested is a null hypothesis regarding the variation in
the same row on the basis of residual term which is indicated in Table 2.
For instance, the hypothesis regarding quadratic is that there is no
significant quadratic relationship in estimating the “D.L.” values from
weights. This will always be true throughout the following tables in the
same situations.


The reader who is not familiar with the meaning of linear, quadratic,
and cubic terms is referred to Goulden’s Methods of Statistical
Analysis, 1939, pp. 166-169. To state it here very simply, if we have
only one degree of freedom regarding the tested variation, such as sex
and date in our problem, there is only a linear relationship between
the two levels of variation and the “D.L.” values. On the other hand,
if we have more than 2 degrees of freedom or more than 3 levels of
variation, then we can separate them into component parts: linear,
quadratic, cubic, and so on, that are mutually independent. Usually we
are well satisfied to calculate up to the cubic term even if we have
more than 3 degrees of freedom or more than 4 levels of variation. The
calculation of sums of squares for linear, quadratic, and cubic terms
will be illustrated later.
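
The sense in which the linear, quadratic, and cubic components are
mutually independent can be shown with the standard orthogonal
polynomial values for four equally spaced levels, such as the four
rates. The Python sketch below is a small illustration; the dictionary
and function names are assumptions introduced here.

```python
# Standard orthogonal polynomial values for four equally spaced levels.
contrasts = {
    "linear": [-3, -1, 1, 3],
    "quadratic": [1, -1, -1, 1],
    "cubic": [-1, 3, -3, 1],
}


def dot(u, v):
    """Sum of products of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))


names = list(contrasts)
for i, first in enumerate(names):
    # Each set of coefficients sums to zero ...
    assert sum(contrasts[first]) == 0
    for second in names[i + 1:]:
        # ... and every pair of sets has a zero sum of products.
        assert dot(contrasts[first], contrasts[second]) == 0

sums_of_squares = {name: dot(c, c) for name, c in contrasts.items()}
print(sums_of_squares)
```

The sums of squares of the coefficients are the divisors that enter
the later calculation of the sum of squares for each component.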


The Main Effects of Rates 
The main effect of rates was found to be highly significant, that
is, the mean D.L. value varied significantly with the rate (F = 170.24,
Table 2). The relationship between D.L. values and rates was linear
and can be mathematically expressed by the equation

    Y = 24.8554 + 9.54978x ,    (2)

where x = (R - 125)/50. The analysis of variance showed that only the
linear effect was significant (Table 4).


TABLE 4
Components of Variation Due to Rate

                              Sum of     Mean                Test of
Source of Variation   D.F.    Squares    Square      F       Hypothesis
Linear                  1     51,071     51,071    510.71    Rejected
Quadratic               1          1          1      -       Accepted
Cubic                   1          0          0      -       Accepted
Rates                   3     51,072

Regression Equations for the Prediction of
“D.L.” Values from Rates

Linear Form:      Y = 24.8554 + 9.54978x
Quadratic Form:   Y = 24.8157 + 9.54978x + .0317x²
Cubic Form:       Y = 24.8157 + 9.54978x + .0317x² - .00567x³

where x = (R - 125)/50.


The Sex and Rate Combinations 


A significant difference was found between the mean D.L. values 
of the sexes (F = 315.34, Table 2). 
The equation connecting D.L. values and rate was determined for
males as

    Y = 33.2451 + 12.72696x ,    where x = (Rate - 125)/50 .    (3)

The equation for females was

    Y = 16.4656 + 6.37268x .    (4)


The components of variation due to sex x rate are given in Table 
6. The quadratic component is not significant, so the relation be- 
tween D.L. values for sex x rate could be adequately represented by 
the linear equation. 
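
The two sex-specific lines can be compared with the overall line for
rates. The Python sketch below is an illustration only; it uses the
coefficients quoted in equations (3) and (4) and in the linear form for
rates, and it relies on the fact that the design contains equal numbers
of men and women. The names are assumptions made for the example.

```python
def predict(intercept: float, slope: float, rate: float) -> float:
    """Predicted D.L. for a rate in cc per 30 seconds, with x = (rate - 125)/50."""
    return intercept + slope * (rate - 125) / 50


MALE = (33.2451, 12.72696)     # equation (3)
FEMALE = (16.4656, 6.37268)    # equation (4)
OVERALL = (24.8554, 9.54978)   # linear form for rates, both sexes together

RATES = [50, 100, 150, 200]

for rate in RATES:
    m = predict(*MALE, rate)
    f = predict(*FEMALE, rate)
    # With equal numbers of men and women, the overall line should lie
    # midway between the two sex-specific lines, up to rounding.
    assert abs((m + f) / 2 - predict(*OVERALL, rate)) < 0.01
    print(rate, round(m, 2), round(f, 2), round(m - f, 2))
```

The widening gap between the two predictions as the rate increases is
what the significant linear component of the sex x rate interaction
expresses.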

The summary of the results for sex x rate combinations is given 
in Table 7. The means of the D.L. values for each of the four differ- 
ent rates are given for each sex. The women were more sensitive 
than the men at each of the four rate levels. The meaning and use 
of the constants given in columns (8), (9), (10), (11), and (12) and 
rows (7), (8), and (9) are explained as follows: 

A_r simply denotes the mean D.L. value of different rates for
each sex and total. B_r, C_r, and D_r denote the linear, the quadratic,
and the cubic regression coefficients for each sex and total,
respectively. These values can be found in Tables 4 and 5.

ξ′₁(s) denotes the linear polynomial coefficient for each sex;
ξ′₁(r), ξ′₂(r), and ξ′₃(r) denote the linear, the quadratic, and the
cubic polynomial coefficients for each rate, respectively. All of the
coefficients can be found in Fisher and Yates’ Table 22.*†


TABLE 5
Interaction Between Sex and Rate

Equations Connecting “D.L.” Value and Rate
(Y = “D.L.” value, x = (Rate - 125)/50)

Sex       Form          Equation
Male      Linear:       Y = 33.2451 + 12.72696x
          Quadratic:    Y = 32.8640 + 12.72696x + .3049x²
          Cubic:        Y = 32.8640 + 12.72696x + .3049x² + .406867x³
Female    Linear:       Y = 16.4656 + 6.37268x
          Quadratic:    Y = 16.7675 + 6.37268x - .2415x²
          Cubic:        Y = 16.7675 + 6.37268x - .2415x² - .408067x³


TABLE 6
Components of Variation Due to Sex x Rate

                                Sum of     Mean               Test of
Source of Variation     D.F.    Squares    Square     F       Hypothesis
Sex x Rate linear         1      5,653     5,653    56.53     Rejected
Sex x Rate quadratic      1         33        33      -       Accepted
Sex x Rate cubic          1         34        34      -       Accepted
Sex x Rate                3      5,720

* Fisher, R. A., and Yates, F. Statistical Tables. Oliver and Boyd, 1938, pp.
-59.

† The constants in later tables, i.e., Tables X, XIII, and XVI, have similar
meanings and use.


TABLE 8
Interaction Between Sight and Weight

Equations Connecting “D.L.” Value and Weight
(Y = “D.L.” value, x = (W - 250)/50)

Sight     Form          Equation
Normal    Linear:       Y = 19.7210 + .082711x
          Quadratic:    Y = 19.5036 + .082711x + .054349x²
          Cubic:        Y = 19.5036 + .082711x + .054349x² - .030556x³
Blind     Linear:       Y = 29.9897 - 2.050564x
          Quadratic:    Y = 29.0485 - 2.050564x + .2353x²
          Cubic:        Y = 29.0485 - 2.050564x + .2353x² - .053558x³


TABLE 9
Components of Variation Due to Sight x Weight

                                    Sum of     Mean               Test of
Source of Variation         D.F.    Squares    Square     F       Hypothesis
Sight x Weight linear         1      2,039     2,039    20.39     Rejected
Sight x Weight quadratic      1         44        44      -       Accepted
Sight x Weight cubic          1          2         2      -       Accepted
Remainder                     3          4         1      -       Accepted
Sight x Weight                6      2,089


The sums of squares of various components in Tables 4 and 6
can be calculated from the figures given in Table 7: for instance, to
obtain the sums of squares due to the component rate linear (see
Table 4), we have (i) to multiply each of the 4 sums of D.L. values
in row 5 (or mean values in row 6) by ξ′₁(r), which occurs in the
same column as itself; (ii) add up the 4 products; (iii) square the
sum; (iv) divide by 112, which is the number of D.L. values for each
rate [if in (i) we use the mean values, then we multiply by 112 in this
step]; and (v) divide by the sum of squares of ξ′₁(r), namely, 20.
Following these steps, we obtain

    [1183.0(-3) + 2245.4(-1) + 3315.1(+1) + 4391.7(+3)]² / [112(20)] = 51071

or

    [10.5625(-3) + 20.0482(-1) + 29.5991(+1) + 39.2116(+3)]² (112) / 20 = 51071.
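
The five steps can be collected into one small function. The Python
sketch below is a minimal illustration, with a hypothetical function
name, applied to the four sums of D.L. values for the rates quoted in
the worked example.

```python
def contrast_ss(totals, coeffs, n_per_total):
    """Sum of squares for one component, computed from totals of n values each."""
    contrast = sum(t * c for t, c in zip(totals, coeffs))   # steps (i) and (ii)
    divisor = n_per_total * sum(c * c for c in coeffs)      # steps (iv) and (v)
    return contrast ** 2 / divisor                          # step (iii), then divide


rate_totals = [1183.0, 2245.4, 3315.1, 4391.7]   # sums of D.L. values per rate
linear = [-3, -1, 1, 3]                          # linear coefficients, four levels

ss_linear = contrast_ss(rate_totals, linear, n_per_total=112)
print(round(ss_linear))
```

The same function gives the quadratic and cubic components when the
corresponding coefficients are supplied in place of the linear ones.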





TABLE 11
Interaction Between Sight and Rate

Equations Connecting “D.L.” Value and Rate
(Y = “D.L.” value, x = (Rate - 125)/50)

Sight     Form          Equation
Normal    Linear:       Y = 19.7210 + 7.66658x
          Quadratic:    Y = 20.1523 + 7.66658x - .34505x²
          Cubic:        Y = 20.1523 + 7.66658x - .34505x² - .40742x³
Blind     Linear:       Y = 29.9897 + 11.43304x
          Quadratic:    Y = 29.4794 + 11.43304x + .40845x²
          Cubic:        Y = 29.4794 + 11.43304x + .40845x² + .4063x³


TABLE 12
Components of Variation Due to Sight x Rate

                                  Sum of     Mean               Test of
Source of Variation       D.F.    Squares    Square     F       Hypothesis
Sight x Rate linear         1      1,986     1,986    19.86     Rejected
Sight x Rate quadratic      1         64        64      -       Accepted
Sight x Rate cubic          1         33        33      -       Accepted
Sight x Rate                3      2,083


For another example, to obtain the sum of squares due to the
component sex x rate quadratic, we have (i) to multiply each of the
8 sums of D.L. values (or mean values) by two numbers, namely, the
values of ξ′₁(s) and ξ′₂(r), which occur in the same column and in
the same row, respectively, as itself; (ii) add up the 8 products; (iii)
square the sum; (iv) divide by 56, which is the number of D.L. values
for each sex-rate combination [if in (i) we use the mean values,
then we multiply by 56 in this step]; and (v) divide by the product
of the sum of squares of ξ′₁(s), namely, 2, and of the sum of squares
of ξ′₂(r), namely, 4. Following these steps, we obtain

    [ …(+1)(-1) + …(-1)(-1) + 2180.5(-1)(-1) + 2954.7(+1)(-1)
      + …(+1)(+1) + 736.6(-1)(+1) + 1134.6(-1)(+1) + 1437.0(+1)(+1) ]² / [56(2)(4)] = 33

or

    [ …(+1)(-1) + …(-1)(-1) + 38.9375(-1)(-1) + 52.7625(+1)(-1)
      + …(+1)(+1) + 13.1536(-1)(+1) + 20.2607(-1)(+1) + 25.6607(+1)(+1) ]² (56) / [2(4)] = 33.





All the other values of sum of squares in Tables 4 and 6 can be 
calculated in the same way. Similarly, the values of sum of squares 
in Table 3 can be calculated in the same way from Table 10. 


The Sight and Weight Combinations 
It was noted that a highly significant difference in mean D.L. values
was found between the normally sighted individuals and those
who were congenitally blind. A significant interaction was also found
between sight and weight (Table 2). The equations between D.L. values
and weight were established as follows:


A 


For normal sight Y = 19.7210 + .08271llz, (5) 
For congenitally blind Y= 29.9897 — 2.0505642, (6) 
where x = Hoan (Table 8). 


Only the linear component of the sight x weight interaction was
found to be significant (Table 9).

Table 10 summarizes the results for the sight x weight combinations.
It is noted that the mean D.L. value for each of the six weights
was greater for the blind, that is, the congenitally blind were as a
group less sensitive than the normally sighted.


The Sight and Rate Combinations 
The equations connecting D.L. values and rate were established
as follows for the normally sighted and the congenitally blind,
respectively:


Normal    $Y = 19.7210 + 7.66658x$,    (7)


Blind     $Y = 29.9897 + 11.43304x$,   (8)


where $x = \dfrac{R - 125}{50}$ (Table 11).


The analysis of variance applied to testing the goodness of fit 
showed that only the linear component of variation due to the sight x 
rate combination was significant (Table 12). 
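The linear component reported in Table 12 can be recovered from the two fitted slopes alone. The sketch below is an illustrative check under stated assumptions: 56 D.L. values in each sight-rate class (448 values over 2 sights and 4 rates), linear coefficients -3, -1, +1, +3 for the four rates, coefficients -1, +1 for the two sights, and the coding x = (R - 125)/50, so that the rate coefficient equals 2x. The variable names are not the authors'.

```python
slope_normal = 7.66658     # equation (7)
slope_blind = 11.43304     # equation (8)
n_per_class = 56           # assumed number of D.L. values per sight-rate class

rate_coeffs = [-3, -1, 1, 3]
rate_ss = sum(c * c for c in rate_coeffs)      # 20
sight_ss = (-1) ** 2 + (+1) ** 2               # 2

# Because the rate coefficient is 2x, the weighted sum of the four class means
# of one group equals rate_ss * slope / 2.
weighted_normal = rate_ss * slope_normal / 2
weighted_blind = rate_ss * slope_blind / 2

# Sight x rate linear contrast: sight coefficient times rate coefficient.
contrast = (-1) * weighted_normal + (+1) * weighted_blind
ss_linear = contrast ** 2 * n_per_class / (sight_ss * rate_ss)

print(round(ss_linear))
```

Under these assumptions the result rounds to 1,986, the value entered for "Sight x Rate linear" in Table 12.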

The summary of the results for the sight x rate combinations is 
recorded in Table 13. The normally sighted individuals as a group 
were consistently more sensitive than the congenitally blind as dem- 
onstrated by the significantly greater mean D.L. value for the latter 
at each of the four rates of increasing weights. 


The Sex x Sight x Rate Combinations 

The sex x sight x rate combination gave the only statistically
significant second-order interaction (Table 2). The interaction among
sex, sight, and rate shows whether the pattern of variation in
D.L. values with sex and sight remains the same from rate to rate,
whether the pattern of variation with sight and rate remains the
same from sex to sex, and whether the pattern of variation with sex
and rate remains the same from the normally sighted to the
congenitally blind. The three statements are logically equivalent,
and there is one numerical measure of the interaction obtainable
from the experimental observations.

We fitted the following equations connecting the D.L. values and 
rate for each of the four situations (Table 14): 


Normally sighted males        $Y = 22.4946 + 8.86216x$,    (9)


Congenitally blind males      $Y = 43.9955 + 16.59180x$,   (10)
TABLE 14
Interaction Between Sex, Sight, and Rate
Equations Connecting "D.L." Value and Rate
Y = "D.L." Value, x = (R - 125)/50

Sex      Sight
Male     Normal   Linear:     Y = 22.4946 + 8.86216x
                  Quadratic:  Y = 22.8138 + 8.86216x - .25535x²
                  Cubic:      Y = 22.8138 + 8.86216x - .25535x² + .094033x³
         Blind    Linear:     Y = 43.9955 + 16.59180x
                  Quadratic:  Y = 42.9140 + 16.59180x + .86520x²
                  Cubic:      Y = 42.9140 + 16.59180x + .86520x² + .719667x³
Female   Normal   Linear:     Y = 16.9473 + 6.47108x
                  Quadratic:  Y = 17.4906 + 6.47108x - .43485x²
                  Cubic:      Y = 17.4906 + 6.47108x - .43485x² - .908900x³
         Blind    Linear:     Y = 15.9839 + 6.27430x
                  Quadratic:  Y = 16.0442 + 6.27430x - .04820x²
                  Cubic:      Y = 16.0442 + 6.27430x - .04820x² + .092833x³


TABLE 15
Components of Variation Due to Sex x Sight x Rate

Source of Variation             D.F.   Sum of Squares   Mean Square   F       Test of Hypothesis
Sex x Sight x Rate linear       1      2,199            2,199         21.99   Rejected
Sex x Sight x Rate quadratic    1      15               15            -       Accepted
Sex x Sight x Rate cubic        1      2                2             -       Accepted
Sex x Sight x Rate              3      2,216





Normally sighted females      $Y = 16.9473 + 6.47108x$,    (11)


Congenitally blind females    $Y = 15.9839 + 6.27430x$,    (12)


where $x = \dfrac{R - 125}{50}$.

The quadratic and cubic equations were also established but only
the linear component of the sex x sight x rate variation was
significant (Table 15).
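The linear component of the second-order interaction in Table 15 follows from the four slopes of equations (9) to (12). The sketch below is an illustrative check only. It assumes 28 D.L. values in each sex-sight-rate class (448 values over 16 classes), rate coefficients -3, -1, +1, +3, coefficients -1, +1 for each of sex and sight, and the coding x = (R - 125)/50; the names used are hypothetical.

```python
# Slopes of the linear equations (9)-(12), keyed by (sex, sight).
slopes = {
    ("male", "normal"): 8.86216,
    ("male", "blind"): 16.59180,
    ("female", "normal"): 6.47108,
    ("female", "blind"): 6.27430,
}
sex_coeff = {"male": -1, "female": +1}
sight_coeff = {"normal": -1, "blind": +1}
rate_coeffs = [-3, -1, 1, 3]
rate_ss = sum(c * c for c in rate_coeffs)   # 20
n_per_class = 28                            # assumed values per sex-sight-rate class

# For one group the weighted sum of its four rate means is rate_ss * slope / 2,
# because the rate coefficient equals 2x.
contrast = sum(
    sex_coeff[sex] * sight_coeff[sight] * rate_ss * slope / 2
    for (sex, sight), slope in slopes.items()
)
divisor = 2 * 2 * rate_ss                   # product of the three sums of squares
ss_linear = contrast ** 2 * n_per_class / divisor

print(round(ss_linear))
```

Under these assumptions the result rounds to 2,199, the entry for "Sex x Sight x Rate linear" in Table 15.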
















PALMER 0. JOHNSON AND FEI TSAO 



































TABLE 16
Summary of Results for Sex x Sight x Rate Combinations

[Table printed sideways in the original; its body is not legible in this copy. The columns are Sex, Sight, Rate (50, 100, 150, 200), Total, and the fitted coefficients; the rows give the sum of scores and the mean for each sex-sight combination, followed by the total mean.]











126 PSYCHOMETRIKA 


Table 16 summarizes the results presented in this section. In
addition, the observed mean D.L. values for each of the four rates
are given for the sex-sight combinations. The mean main effects of
rate are shown in row (10). The mean over-all rate effects of the
sex-sight combinations are given in Column (8). In the case of the
men, the normally sighted were uniformly more sensitive than the
congenitally blind. On the other hand, the congenitally blind women
appeared to be on the average slightly more sensitive to increasing
rates than the normally sighted women. The normally sighted women
were more sensitive than the normally sighted men. Likewise, the
congenitally blind women were more sensitive than both the
congenitally blind men and the normally sighted men.


The Summary and Conclusions 

The principal purpose of the investigation was to show the ap- 
plication of modern ideas of experimental design to an experiment 
in psychology. The particular type of design chosen for illustration 
was the factorial design, a type of design offering much promise in 
the study of the effects and interactions of psychological factors. The 
appropriate statistical analysis for this type of design is the analysis 
of variance, which was applied to the experimental observations. In 
addition, the mathematical formulation and solution of the problem 
were carried out; the application of the formulation was made to the 
psychological problem under consideration (See Part II, following 
this summary). 

The psychological problem studied was that of determining the 
differential limen of subjects for weights increasing at constant rates. 
Since there were seven different weights and four different rates of 
increase, there were twenty-eight different weight-rate combinations. 
Five trials of each combination were made at each of two experi- 
mental periods separated by an interval of one week. The subjects 
were of both sexes, equal numbers of congenitally blind and of nor- 
mally sighted individuals. The factorial arrangement was of the type: 
4 rates x 7 weights x 2 sexes x 2 sights x 2 dates. 
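The factorial arrangement can be enumerated directly. The sketch below is illustrative: the rate levels 50, 100, 150, 200 follow from the coding x = (R - 125)/50 used in the tables, while the weights and dates are simply numbered, since their actual values are not needed to count the combinations.

```python
from itertools import product

factors = {
    "rate": [50, 100, 150, 200],
    "weight": list(range(1, 8)),      # seven standard weights, numbered only
    "sex": ["male", "female"],
    "sight": ["normal", "blind"],
    "date": [1, 2],                   # two experimental periods
}

# Every cell of the 4 x 7 x 2 x 2 x 2 arrangement.
cells = list(product(*factors.values()))

# The weight-rate combinations presented to each subject at each period.
weight_rate = list(product(factors["weight"], factors["rate"]))

print(len(cells), len(weight_rate))
```

The count of 28 weight-rate combinations agrees with the figure given in the text; the full arrangement has 224 cells.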

The principal findings were: 

1. The adequacy and reliability of the experimental technique were 
demonstrated by the fact that no appreciable difference was found in 
the two sets of measurements taken at an interval of one week. The 
findings from check control stimuli also supplemented this result. 

2. From the tests of significance four first-order interactions (sex
x sight, sex x rate, sight x weight, sight x rate) and one second-order
interaction, sex x sight x rate, were found to be significant. Beyond
its direct experimental interest, this finding makes a significant
contribution to methodology in that it shows that we should first
test all interactions before making a complete analysis of variance in
any problem.

3. The means of the differential limen (D.L.) values varied
systematically with changes in rates. They became greater with the
increases in rates.

4. The means of the D.L. values varied systematically with changes
in weight. They became less with increases in weight.

5. Significant differences between the means of the D.L. values were 
found for both sex and sight for the subjects used in this experiment. 
The women were more sensitive than the men at each of the four rate 
levels. The normally sighted as a group were more sensitive than the 
congenitally blind at each of the six weight levels and at each of the 
four rate levels. The normally sighted men were more sensitive than 
the congenitally blind men at each of the four rate levels. No marked 
difference was noticeable between the normally sighted women and 
the congenitally blind women at any one of the four rate levels. 

6. The mathematical relations between D.L. values and each of the
factors, and the interacting factors, were established. There were
twelve regression equations, as follows:

a. D.L. value on weights.
b. D.L. value on rates.
c. D.L. value for each sex on rates.
d. D.L. value for each sight on rates.
e. D.L. value for each sight on weights.
f. D.L. value for each sex and for each sight on rates.


The regression of D.L. values in each case could be graduated by a
linear equation. The analysis of variance was used to test goodness
of fit of orthogonal polynomial coefficients. With these graduating
equations it is possible to compute D.L. values for any particular
value of the independent variables within the range of factor levels
used in the experiment. The known relations can also be useful in
solving problems of estimation.
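As an illustration of how the graduating equations may be used, the sketch below evaluates the four linear equations (9) to (12) at any rate inside the experimental range. It is a minimal sketch, not part of the original analysis: the function names are hypothetical, the coding x = (R - 125)/50 is taken from Table 14, and the intercept for congenitally blind females is taken as 15.9839, the value given in Table 14.

```python
# (intercept, slope) of the linear graduating equations, keyed by (sex, sight).
LINEAR = {
    ("male", "normal"): (22.4946, 8.86216),
    ("male", "blind"): (43.9955, 16.59180),
    ("female", "normal"): (16.9473, 6.47108),
    ("female", "blind"): (15.9839, 6.27430),
}


def coded_rate(rate):
    """Code the rate R as x = (R - 125) / 50."""
    return (rate - 125) / 50


def predicted_dl(sex, sight, rate):
    """Graduated D.L. value; only rates inside the experimental range are accepted."""
    if not 50 <= rate <= 200:
        raise ValueError("rate lies outside the range used in the experiment")
    intercept, slope = LINEAR[(sex, sight)]
    return intercept + slope * coded_rate(rate)


for key in LINEAR:
    print(key, round(predicted_dl(*key, rate=175), 4))
```

The range check reflects the restriction stated in the text: the equations are meant for values within the range of factor levels used in the experiment, not for extrapolation.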

The methods illustrated in this investigation, and modifications 
and extensions of them, are capable of very wide application. The 
general principles can be utilized to various degrees and in a number 
of ways. 

The mathematical statistician is being called upon to develop and
analyze statistical designs of increasing complexity since the
introduction of the analysis of variance and co-variance. The
mathematical formulation and solution of the problem presented in
Part I follow.
The reader who is interested in securing a broader background
for the mathematical development in Part II is referred to the first
comprehensive discussion of the general class of statistical
hypotheses, known as “linear” hypotheses, given by St. Kolodziejczyk (4).
The bulletin by Jackson (2) and the paper by Johnson and Neyman
(3) will also be useful especially in relation to educational and
psychological problems. For a review and bibliography of recent
developments the paper by Camp (1) is excellent.


PART II: MATHEMATICAL FORMULATIONS 


Before we develop the mathematical formulations for our problem
in particular, we start with the derivation of those equations
necessary in the analysis of variance with only two classifications,
say, sex and sight. The numbers in different subclasses are always
assumed to be equal. We denote by $X_{sit}$ the D.L. value obtained by
the t-th individual in the s-th sex of the i-th sight. The basic
assumption in the analysis of variance is that we may write

$$X_{sit} = A + B_s + C_i + I_{si} + z_{sit}, \tag{1}$$

where s = 1, 2; i = 1, 2; t = 1, 2, ..., n; 2 denotes the number of
sexes and also of sights; and n denotes the number of individuals in
each subclass; $B_s$ is a measure of the s-th sex; $C_i$ is a measure of the
i-th sight; $I_{si}$ represents the influence of the interaction between sex
and sight; $z_{sit}$ represents the error. A is defined as the mean for all
groups and individuals; so, furthermore, we assume that

$$\sum_s B_s = 0, \qquad \sum_i C_i = 0, \qquad \sum_s\sum_i I_{si} = 0. \tag{2}$$





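To obtain the solution, we first write

$$\chi^2 = \sum_s\sum_i\sum_t (X_{sit} - A - B_s - C_i - I_{si})^2 - 2\Big(\lambda_1\sum_s B_s + \lambda_2\sum_i C_i + \lambda_3\sum_s\sum_i I_{si}\Big), \tag{3}$$

where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the undetermined multipliers of Lagrange.
Minimizing $\chi^2$ with regard to $A$, $B_s$, $C_i$, and $I_{si}$, we obtain*

$$\begin{aligned}
A &= \frac{1}{N}\sum_s\sum_i\sum_t X_{sit} = X_{\cdot\cdot\cdot}, \qquad \text{where } N = 2 \times 2 \times n,\\
B_s &= \frac{1}{2n}\sum_i\sum_t X_{sit} - A - \frac{\sum_i I_{si}}{2} + \frac{\lambda_1}{2n} = X_{s\cdot\cdot} - X_{\cdot\cdot\cdot} - \frac{\sum_i I_{si}}{2} + \frac{\lambda_1}{2n},\\
C_i &= \frac{1}{2n}\sum_s\sum_t X_{sit} - A - \frac{\sum_s I_{si}}{2} + \frac{\lambda_2}{2n} = X_{\cdot i\cdot} - X_{\cdot\cdot\cdot} - \frac{\sum_s I_{si}}{2} + \frac{\lambda_2}{2n},\\
I_{si} &= \frac{1}{n}\sum_t X_{sit} - A - B_s - C_i + \frac{\lambda_3}{n}.
\end{aligned} \tag{4}$$

From equations (2) and (4), we get

$$\sum_s B_s = \sum_s X_{s\cdot\cdot} - 2X_{\cdot\cdot\cdot} - \frac{\sum_s\sum_i I_{si}}{2} + \frac{\lambda_1}{n}, \tag{5}$$

which reduces to

$$\lambda_1 = 0. \tag{6}$$

Similarly

$$\lambda_2 = \lambda_3 = 0. \tag{7}$$

By the method of elimination, we get

$$\chi_a^2 = \sum_s\sum_i\sum_t (X_{sit} - X_{si\cdot})^2 = \sum_s\sum_i\sum_t X_{sit}^2 - \sum_s\sum_i \{n X_{si\cdot}^2\}. \tag{8}$$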
The hypothesis we wish to test first is

$$H_0: I_{si} = 0, \tag{9}$$

i.e., the hypothesis that there is no influence of the interaction
between sex and sight. Assuming that $H_0$ is true, we have, from
equation (3),

$$\chi_{r_0}^2 = \sum_s\sum_i\sum_t (X_{sit} - A - B_s - C_i)^2 - 2\Big(\alpha_1\sum_s B_s + \alpha_2\sum_i C_i\Big), \tag{10}$$

where $\alpha_1$ and $\alpha_2$ are the undetermined multipliers of Lagrange.


* A dot in place of subscript is used to indicate that the variable has been
averaged with respect to that subscript. This will always be true throughout
this paper.
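Minimizing $\chi_{r_0}^2$ with regard to $A$, $B_s$, and $C_i$, we obtain

$$\begin{aligned}
A &= X_{\cdot\cdot\cdot},\\
B_s &= X_{s\cdot\cdot} - X_{\cdot\cdot\cdot} + \frac{\alpha_1}{2n},\\
C_i &= X_{\cdot i\cdot} - X_{\cdot\cdot\cdot} + \frac{\alpha_2}{2n},
\end{aligned} \tag{11}$$

where

$$\alpha_1 = \alpha_2 = 0. \tag{12}$$

Substituting these values in equation (10) and simplifying, we obtain
the relative minimum value $\chi_{r_0}^2$:

$$\begin{aligned}
\chi_{r_0}^2 &= \sum_s\sum_i\sum_t (X_{sit} - X_{s\cdot\cdot} - X_{\cdot i\cdot} + X_{\cdot\cdot\cdot})^2\\
&= \chi_a^2 + n\sum_s\sum_i (X_{si\cdot} - X_{s\cdot\cdot} - X_{\cdot i\cdot} + X_{\cdot\cdot\cdot})^2\\
&= \chi_a^2 + \sum_s\sum_i \{n X_{si\cdot}^2\} - \sum_s \{2n X_{s\cdot\cdot}^2\} - \sum_i \{2n X_{\cdot i\cdot}^2\} + N X_{\cdot\cdot\cdot}^2\\
&= \chi_a^2 + \chi_0^2.
\end{aligned} \tag{13}$$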
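Then we may test the relative hypothesis on the basis of $H_0$:

$$H_1: B_s = 0, \tag{14}$$

i.e., the hypothesis that there is no significant difference between
the sexes. Assuming that $H_1$ is true, we may write

$$\chi_{r_1}^2 = \sum_s\sum_i\sum_t (X_{sit} - A - C_i)^2 - 2\beta\sum_i C_i, \tag{15}$$

where $\beta$ is the undetermined multiplier of Lagrange. Minimizing $\chi_{r_1}^2$
with regard to $A$ and $C_i$, we have

$$A = X_{\cdot\cdot\cdot}, \qquad C_i = X_{\cdot i\cdot} - X_{\cdot\cdot\cdot} + \frac{\beta}{2n}, \tag{16}$$

where

$$\beta = 0. \tag{17}$$

Substituting all the values into the equation (15) we obtain

$$\chi_{r_1}^2 = \sum_s\sum_i\sum_t (X_{sit} - X_{\cdot i\cdot})^2 = \chi_a^2 + \chi_0^2 + \sum_s \{2n X_{s\cdot\cdot}^2\} - N X_{\cdot\cdot\cdot}^2 = \chi_a^2 + \chi_0^2 + \chi_1^2. \tag{18}$$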
Finally, we may test the relative hypothesis on the basis of $H_1$:

$$H_2: C_i = 0, \tag{19}$$

i.e., the hypothesis that there is no significant difference between
sights.

Assuming that $H_2$ is true and proceeding as before, we obtain

$$\begin{aligned}
\chi_{r_2}^2 &= \sum_s\sum_i\sum_t (X_{sit} - X_{\cdot\cdot\cdot})^2 = \sum_s\sum_i\sum_t X_{sit}^2 - N X_{\cdot\cdot\cdot}^2\\
&= \chi_a^2 + \chi_0^2 + \chi_1^2 + \sum_i \{2n X_{\cdot i\cdot}^2\} - N X_{\cdot\cdot\cdot}^2 = \chi_a^2 + \chi_0^2 + \chi_1^2 + \chi_2^2.
\end{aligned} \tag{20}$$


From the equation (20), the additive property of the sum of
squares is readily demonstrated. It is also noted, in the case of equal
numbers in subclasses, that there is only one answer for each hypothesis
tested whatever the order of testing is. This point should always
be kept in mind throughout the present paper. All the results
obtained may be summarized in Table 1.


TABLE 1
Analysis of Variance of D.L. Values of Different Sexes and Different Sights

Source of Variation   d.f.    Sum of Squares*
Error                 N - 4   $\sum_s\sum_i\sum_t X_{sit}^2 - \sum_s\sum_i \{n X_{si\cdot}^2\}$
Sex x Sight           1       $\sum_s\sum_i \{n X_{si\cdot}^2\} - \sum_s \{2n X_{s\cdot\cdot}^2\} - \sum_i \{2n X_{\cdot i\cdot}^2\} + N X_{\cdot\cdot\cdot}^2$
Sex                   1       $\sum_s \{2n X_{s\cdot\cdot}^2\} - N X_{\cdot\cdot\cdot}^2$
Sight                 1       $\sum_i \{2n X_{\cdot i\cdot}^2\} - N X_{\cdot\cdot\cdot}^2$
Total                 N - 1   $\sum_s\sum_i\sum_t X_{sit}^2 - N X_{\cdot\cdot\cdot}^2$

* Mathematically speaking,

$$N X_{\cdot\cdot\cdot}^2 = \frac{\big(\sum_s\sum_i\sum_t X_{sit}\big)^2}{N}.$$

So we have written $N X_{\cdot\cdot\cdot}^2$ instead of $\big(\sum_s\sum_i\sum_t X_{sit}\big)^2 / N$,
$\sum_s \{2n X_{s\cdot\cdot}^2\}$ instead of $\sum_s \big(\sum_i\sum_t X_{sit}\big)^2 / 2n$,
and so on. However, in doing calculating work, $\sum_s \big(\sum_i\sum_t X_{sit}\big)^2 / 2n$
is more accurate than $\sum_s \{2n X_{s\cdot\cdot}^2\}$ from the viewpoint of significant figures. We
use the latter in the presentation of the formulas only for the sake of simplicity.
This holds true throughout the present article.
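The entries of Table 1 can be computed directly from the class means. The sketch below is a minimal illustration with invented data (three hypothetical observations per sex-sight subclass); the function name `two_way_anova` and the numbers are assumptions, not values from the experiment. It uses the "count times squared mean" form of the formulas, as in the table.

```python
def two_way_anova(x):
    """x[s][i] is the list of n observations for sex s and sight i (equal n)."""
    n_sex, n_sight, n = len(x), len(x[0]), len(x[0][0])
    N = n_sex * n_sight * n
    values = [v for row in x for cell in row for v in cell]
    grand = sum(values) / N
    cell_means = [sum(cell) / n for row in x for cell in row]
    sex_means = [sum(sum(cell) for cell in row) / (n_sight * n) for row in x]
    sight_means = [
        sum(sum(x[s][i]) for s in range(n_sex)) / (n_sex * n)
        for i in range(n_sight)
    ]
    total_sq = sum(v * v for v in values)
    cells = sum(n * m * m for m in cell_means)
    sexes = sum(n_sight * n * m * m for m in sex_means)
    sights = sum(n_sex * n * m * m for m in sight_means)
    correction = N * grand * grand
    return {
        "error": total_sq - cells,
        "sex x sight": cells - sexes - sights + correction,
        "sex": sexes - correction,
        "sight": sights - correction,
        "total": total_sq - correction,
    }


# Hypothetical data: x[sex][sight] -> observations.
data = [
    [[8.0, 10.0, 9.0], [19.0, 21.0, 20.0]],
    [[7.0, 6.0, 8.0], [6.0, 7.0, 5.0]],
]
table1 = two_way_anova(data)
for source, ss in table1.items():
    print(f"{source:12s} {ss:10.4f}")
```

The four component sums of squares add up to the total, which is the additive property shown in equation (20).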

Now we shall work out the equations with three classifications,
say, sex, sight and standard weights. Denote by $X_{sijt}$ the D.L. value
on the j-th weight obtained by the t-th individual of the s-th sex of
the i-th sight.

The basic assumption in the analysis of variance is that we may write

$$X_{sijt} = A + B_s + C_i + D_j + I_{si} + I_{sj} + I_{ij} + I_{sij} + z_{sijt}, \tag{21}$$

where s = 1, 2; i = 1, 2; j = 1, 2, ..., 7; t = 1, 2, ..., n; 2 denotes the
number of sexes and also of sights; 7 denotes the number of weights;
and n denotes the number of individuals in each subclass; $B_s$ is a
measure of the s-th sex; $C_i$ is a measure of the i-th sight; $D_j$ is a
measure of the j-th weight; $I_{si}$ represents the influence of the interaction
between sex and sight; $I_{sj}$, sex and weight; $I_{ij}$, sight and weight;
$I_{sij}$, sex, sight, and weight; $z_{sijt}$, the error. A is defined as the mean
for all groups and individuals. Furthermore, we assume that

$$\begin{aligned}
\sum_s B_s &= 0, & \sum_s\sum_j I_{sj} &= 0,\\
\sum_i C_i &= 0, & \sum_i\sum_j I_{ij} &= 0,\\
\sum_j D_j &= 0, & \sum_s\sum_i\sum_j I_{sij} &= 0,\\
\sum_s\sum_i I_{si} &= 0.
\end{aligned} \tag{22}$$

To obtain the absolute minimum, we write

$$\begin{aligned}
\chi^2 = {} & \sum_s\sum_i\sum_j\sum_t (X_{sijt} - A - B_s - C_i - D_j - I_{si} - I_{sj} - I_{ij} - I_{sij})^2\\
& - 2\Big(\lambda_1\sum_s B_s + \lambda_2\sum_i C_i + \lambda_3\sum_j D_j + \lambda_4\sum_s\sum_i I_{si} + \lambda_5\sum_s\sum_j I_{sj}\\
& \qquad + \lambda_6\sum_i\sum_j I_{ij} + \lambda_7\sum_s\sum_i\sum_j I_{sij}\Big),
\end{aligned} \tag{23}$$

where $\lambda_1, \lambda_2, \ldots, \lambda_7$ are the undetermined multipliers of Lagrange.

Minimizing $\chi^2$ with regard to $A$, $B_s$, $C_i$, $D_j$, $I_{si}$, $I_{sj}$, $I_{ij}$, and $I_{sij}$, we
have
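$$\begin{aligned}
A &= X_{\cdot\cdot\cdot\cdot},\\
B_s &= X_{s\cdot\cdot\cdot} - X_{\cdot\cdot\cdot\cdot} - \frac{\sum_i I_{si}}{2} - \frac{\sum_j I_{sj}}{7} - \frac{\sum_i\sum_j I_{sij}}{14} + \frac{\lambda_1}{14n},\\
C_i &= X_{\cdot i\cdot\cdot} - X_{\cdot\cdot\cdot\cdot} - \frac{\sum_s I_{si}}{2} - \frac{\sum_j I_{ij}}{7} - \frac{\sum_s\sum_j I_{sij}}{14} + \frac{\lambda_2}{14n},\\
D_j &= X_{\cdot\cdot j\cdot} - X_{\cdot\cdot\cdot\cdot} - \frac{\sum_s I_{sj}}{2} - \frac{\sum_i I_{ij}}{2} - \frac{\sum_s\sum_i I_{sij}}{4} + \frac{\lambda_3}{4n},\\
I_{si} &= X_{si\cdot\cdot} - X_{\cdot\cdot\cdot\cdot} - B_s - C_i - \frac{\sum_j I_{sj}}{7} - \frac{\sum_j I_{ij}}{7} - \frac{\sum_j I_{sij}}{7} + \frac{\lambda_4}{7n},\\
I_{sj} &= X_{s\cdot j\cdot} - X_{\cdot\cdot\cdot\cdot} - B_s - D_j - \frac{\sum_i I_{si}}{2} - \frac{\sum_i I_{ij}}{2} - \frac{\sum_i I_{sij}}{2} + \frac{\lambda_5}{2n},\\
I_{ij} &= X_{\cdot ij\cdot} - X_{\cdot\cdot\cdot\cdot} - C_i - D_j - \frac{\sum_s I_{si}}{2} - \frac{\sum_s I_{sj}}{2} - \frac{\sum_s I_{sij}}{2} + \frac{\lambda_6}{2n},\\
I_{sij} &= X_{sij\cdot} - X_{\cdot\cdot\cdot\cdot} - B_s - C_i - D_j - I_{si} - I_{sj} - I_{ij} + \frac{\lambda_7}{n},
\end{aligned} \tag{24}$$

where

$$\lambda_1 = \lambda_2 = \lambda_3 = \cdots = \lambda_7 = 0. \tag{25}$$

By the method of elimination, we obtain

$$\chi_a^2 = \sum_s\sum_i\sum_j\sum_t (X_{sijt} - X_{sij\cdot})^2 = \sum_s\sum_i\sum_j\sum_t X_{sijt}^2 - \sum_s\sum_i\sum_j \{n X_{sij\cdot}^2\}. \tag{26}$$

The hypothesis we wish to test first is

$$H_{01}: I_{sij} = 0, \tag{27}$$

i.e., the hypothesis that there is no significant influence of the
interaction between sex, sight, and weight. Assuming that $H_{01}$ is true,
we have, from equation (26),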
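$$\begin{aligned}
\chi_{r_{01}}^2 = {} & \sum_s\sum_i\sum_j\sum_t (X_{sijt} - A - B_s - C_i - D_j - I_{si} - I_{sj} - I_{ij})^2\\
& - 2\Big(\alpha_1\sum_s B_s + \alpha_2\sum_i C_i + \alpha_3\sum_j D_j + \alpha_4\sum_s\sum_i I_{si} + \alpha_5\sum_s\sum_j I_{sj} + \alpha_6\sum_i\sum_j I_{ij}\Big),
\end{aligned} \tag{28}$$

where $\alpha_1, \alpha_2, \ldots, \alpha_6$ are the undetermined multipliers of Lagrange.
Minimizing $\chi_{r_{01}}^2$ with regard to $A$, $B_s$, $C_i$, $D_j$, $I_{si}$, $I_{sj}$, and $I_{ij}$, we
have

$$\begin{aligned}
A &= X_{\cdot\cdot\cdot\cdot},\\
B_s &= X_{s\cdot\cdot\cdot} - X_{\cdot\cdot\cdot\cdot} - \frac{\sum_i I_{si}}{2} - \frac{\sum_j I_{sj}}{7} + \frac{\alpha_1}{14n},\\
C_i &= X_{\cdot i\cdot\cdot} - X_{\cdot\cdot\cdot\cdot} - \frac{\sum_s I_{si}}{2} - \frac{\sum_j I_{ij}}{7} + \frac{\alpha_2}{14n},\\
D_j &= X_{\cdot\cdot j\cdot} - X_{\cdot\cdot\cdot\cdot} - \frac{\sum_s I_{sj}}{2} - \frac{\sum_i I_{ij}}{2} + \frac{\alpha_3}{4n},\\
I_{si} &= X_{si\cdot\cdot} - X_{\cdot\cdot\cdot\cdot} - B_s - C_i - \frac{\sum_j I_{sj}}{7} - \frac{\sum_j I_{ij}}{7} + \frac{\alpha_4}{7n},\\
I_{sj} &= X_{s\cdot j\cdot} - X_{\cdot\cdot\cdot\cdot} - B_s - D_j - \frac{\sum_i I_{si}}{2} - \frac{\sum_i I_{ij}}{2} + \frac{\alpha_5}{2n},\\
I_{ij} &= X_{\cdot ij\cdot} - X_{\cdot\cdot\cdot\cdot} - C_i - D_j - \frac{\sum_s I_{si}}{2} - \frac{\sum_s I_{sj}}{2} + \frac{\alpha_6}{2n},
\end{aligned} \tag{29}$$

where

$$\alpha_1 = \alpha_2 = \cdots = \alpha_6 = 0. \tag{30}$$

By the method of elimination, we have

$$\begin{aligned}
\chi_{r_{01}}^2 &= \sum_s\sum_i\sum_j\sum_t (X_{sijt} - X_{si\cdot\cdot} - X_{s\cdot j\cdot} - X_{\cdot ij\cdot} + X_{s\cdot\cdot\cdot} + X_{\cdot i\cdot\cdot} + X_{\cdot\cdot j\cdot} - X_{\cdot\cdot\cdot\cdot})^2\\
&= \chi_a^2 + \sum_s\sum_i\sum_j \{n X_{sij\cdot}^2\} - \sum_s\sum_i \{7n X_{si\cdot\cdot}^2\} - \sum_s\sum_j \{2n X_{s\cdot j\cdot}^2\} - \sum_i\sum_j \{2n X_{\cdot ij\cdot}^2\}\\
&\qquad + \sum_s \{14n X_{s\cdot\cdot\cdot}^2\} + \sum_i \{14n X_{\cdot i\cdot\cdot}^2\} + \sum_j \{4n X_{\cdot\cdot j\cdot}^2\} - N X_{\cdot\cdot\cdot\cdot}^2\\
&= \chi_a^2 + \chi_{01}^2.
\end{aligned} \tag{31}$$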
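Then we may test the following relative hypothesis on the basis of
$H_{01}$:

$$H_{02}: I_{si} = 0, \tag{32}$$

i.e., the hypothesis that there is no significant influence of interaction
between sex and sight. Assuming that $H_{02}$ is true, we may write

$$\begin{aligned}
\chi_{r_{02}}^2 = {} & \sum_s\sum_i\sum_j\sum_t (X_{sijt} - A - B_s - C_i - D_j - I_{sj} - I_{ij})^2\\
& - 2\Big(\beta_1\sum_s B_s + \beta_2\sum_i C_i + \beta_3\sum_j D_j + \beta_4\sum_s\sum_j I_{sj} + \beta_5\sum_i\sum_j I_{ij}\Big),
\end{aligned} \tag{33}$$

where $\beta_1, \beta_2, \ldots, \beta_5$ are the undetermined multipliers of Lagrange.
Minimizing $\chi_{r_{02}}^2$ with regard to $A$, $B_s$, $C_i$, $D_j$, $I_{sj}$, and $I_{ij}$, we have

$$\begin{aligned}
A &= X_{\cdot\cdot\cdot\cdot},\\
B_s &= X_{s\cdot\cdot\cdot} - X_{\cdot\cdot\cdot\cdot} - \frac{\sum_j I_{sj}}{7} + \frac{\beta_1}{14n},\\
C_i &= X_{\cdot i\cdot\cdot} - X_{\cdot\cdot\cdot\cdot} - \frac{\sum_j I_{ij}}{7} + \frac{\beta_2}{14n},\\
D_j &= X_{\cdot\cdot j\cdot} - X_{\cdot\cdot\cdot\cdot} - \frac{\sum_s I_{sj}}{2} - \frac{\sum_i I_{ij}}{2} + \frac{\beta_3}{4n},\\
I_{sj} &= X_{s\cdot j\cdot} - X_{\cdot\cdot\cdot\cdot} - B_s - D_j - \frac{\sum_i I_{ij}}{2} + \frac{\beta_4}{2n},\\
I_{ij} &= X_{\cdot ij\cdot} - X_{\cdot\cdot\cdot\cdot} - C_i - D_j - \frac{\sum_s I_{sj}}{2} + \frac{\beta_5}{2n},
\end{aligned} \tag{34}$$

where

$$\beta_1 = \beta_2 = \cdots = \beta_5 = 0. \tag{35}$$

By the method of elimination, we have

$$\begin{aligned}
\chi_{r_{02}}^2 &= \sum_s\sum_i\sum_j\sum_t (X_{sijt} - X_{s\cdot j\cdot} - X_{\cdot ij\cdot} + X_{\cdot\cdot j\cdot})^2\\
&= \chi_a^2 + \chi_{01}^2 + \sum_s\sum_i \{7n X_{si\cdot\cdot}^2\} - \sum_s \{14n X_{s\cdot\cdot\cdot}^2\} - \sum_i \{14n X_{\cdot i\cdot\cdot}^2\} + N X_{\cdot\cdot\cdot\cdot}^2\\
&= \chi_a^2 + \chi_{01}^2 + \chi_{02}^2.
\end{aligned} \tag{36}$$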
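Next we wish to test the following relative hypothesis on the basis of
$H_{02}$:

$$H_{03}: I_{sj} = 0, \tag{37}$$

i.e., the hypothesis that there is no significant influence of interaction
between sex and weight. Assuming that $H_{03}$ is true, we may write

$$\begin{aligned}
\chi_{r_{03}}^2 = {} & \sum_s\sum_i\sum_j\sum_t (X_{sijt} - A - B_s - C_i - D_j - I_{ij})^2\\
& - 2\Big(\gamma_1\sum_s B_s + \gamma_2\sum_i C_i + \gamma_3\sum_j D_j + \gamma_4\sum_i\sum_j I_{ij}\Big),
\end{aligned} \tag{38}$$

where $\gamma_1$, $\gamma_2$, $\gamma_3$, and $\gamma_4$ are the undetermined multipliers of Lagrange.
Proceeding as before, we obtain

$$\begin{aligned}
\chi_{r_{03}}^2 &= \sum_s\sum_i\sum_j\sum_t (X_{sijt} - X_{s\cdot\cdot\cdot} - X_{\cdot ij\cdot} + X_{\cdot\cdot\cdot\cdot})^2\\
&= \chi_a^2 + \chi_{01}^2 + \chi_{02}^2 + \sum_s\sum_j \{2n X_{s\cdot j\cdot}^2\} - \sum_s \{14n X_{s\cdot\cdot\cdot}^2\} - \sum_j \{4n X_{\cdot\cdot j\cdot}^2\} + N X_{\cdot\cdot\cdot\cdot}^2\\
&= \chi_a^2 + \chi_{01}^2 + \chi_{02}^2 + \chi_{03}^2.
\end{aligned} \tag{39}$$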
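Again, we wish to test the following relative hypothesis on the
basis of $H_{03}$:

$$H_{04}: I_{ij} = 0, \tag{40}$$

i.e., the hypothesis that there is no significant influence of interaction
between sight and weight. Assuming that $H_{04}$ is true, we have

$$\chi_{r_{04}}^2 = \sum_s\sum_i\sum_j\sum_t (X_{sijt} - A - B_s - C_i - D_j)^2 - 2\Big(\delta_1\sum_s B_s + \delta_2\sum_i C_i + \delta_3\sum_j D_j\Big), \tag{41}$$

where $\delta_1$, $\delta_2$, and $\delta_3$ are the undetermined multipliers of Lagrange.
Proceeding as before, we get

$$\begin{aligned}
\chi_{r_{04}}^2 &= \sum_s\sum_i\sum_j\sum_t (X_{sijt} - X_{s\cdot\cdot\cdot} - X_{\cdot i\cdot\cdot} - X_{\cdot\cdot j\cdot} + 2X_{\cdot\cdot\cdot\cdot})^2\\
&= \chi_a^2 + \chi_{01}^2 + \chi_{02}^2 + \chi_{03}^2 + \sum_i\sum_j \{2n X_{\cdot ij\cdot}^2\} - \sum_i \{14n X_{\cdot i\cdot\cdot}^2\} - \sum_j \{4n X_{\cdot\cdot j\cdot}^2\} + N X_{\cdot\cdot\cdot\cdot}^2\\
&= \chi_a^2 + \chi_{01}^2 + \chi_{02}^2 + \chi_{03}^2 + \chi_{04}^2.
\end{aligned} \tag{42}$$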


After we have tested all the hypotheses dealing with interactions 
either of first order or of second order, we come to the tests of main 
effects. The first hypothesis we wish to test is 


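$$H_1: B_s = 0, \tag{43}$$

i.e., the hypothesis that there is no significant difference between sexes.
Assuming that this hypothesis is true, we may write

$$\chi_{r_1}^2 = \sum_s\sum_i\sum_j\sum_t (X_{sijt} - A - C_i - D_j)^2 - 2\Big(\epsilon_1\sum_i C_i + \epsilon_2\sum_j D_j\Big), \tag{44}$$

where $\epsilon_1$ and $\epsilon_2$ are undetermined multipliers of Lagrange. Proceeding
as before, we have

$$\begin{aligned}
\chi_{r_1}^2 &= \sum_s\sum_i\sum_j\sum_t (X_{sijt} - X_{\cdot i\cdot\cdot} - X_{\cdot\cdot j\cdot} + X_{\cdot\cdot\cdot\cdot})^2\\
&= \chi_a^2 + \chi_{01}^2 + \chi_{02}^2 + \chi_{03}^2 + \chi_{04}^2 + \sum_s \{14n X_{s\cdot\cdot\cdot}^2\} - N X_{\cdot\cdot\cdot\cdot}^2\\
&= \chi_a^2 + \chi_{01}^2 + \chi_{02}^2 + \chi_{03}^2 + \chi_{04}^2 + \chi_1^2.
\end{aligned} \tag{45}$$

The next hypothesis is

$$H_2: C_i = 0, \tag{46}$$

i.e., the hypothesis that there is no significant difference between
sights. Assuming $H_2$ is true, we may write

$$\chi_{r_2}^2 = \sum_s\sum_i\sum_j\sum_t (X_{sijt} - A - D_j)^2 - 2\rho\sum_j D_j, \tag{47}$$

where $\rho$ is the undetermined multiplier of Lagrange. By using the
same method as before, we obtain

$$\begin{aligned}
\chi_{r_2}^2 &= \sum_s\sum_i\sum_j\sum_t (X_{sijt} - X_{\cdot\cdot j\cdot})^2\\
&= \chi_a^2 + \chi_{01}^2 + \chi_{02}^2 + \chi_{03}^2 + \chi_{04}^2 + \chi_1^2 + \sum_i \{14n X_{\cdot i\cdot\cdot}^2\} - N X_{\cdot\cdot\cdot\cdot}^2\\
&= \chi_a^2 + \chi_{01}^2 + \chi_{02}^2 + \chi_{03}^2 + \chi_{04}^2 + \chi_1^2 + \chi_2^2.
\end{aligned} \tag{48}$$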


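Finally, the following hypothesis should be tested:

$$H_3: D_j = 0, \tag{49}$$

i.e., the hypothesis that there is no significant difference between
weights. Assuming that $H_3$ is true, we may write

$$\chi_{r_3}^2 = \sum_s\sum_i\sum_j\sum_t (X_{sijt} - A)^2, \tag{50}$$

which reduces easily to

$$\begin{aligned}
\chi_{r_3}^2 &= \sum_s\sum_i\sum_j\sum_t (X_{sijt} - X_{\cdot\cdot\cdot\cdot})^2 = \sum_s\sum_i\sum_j\sum_t X_{sijt}^2 - N X_{\cdot\cdot\cdot\cdot}^2\\
&= \chi_a^2 + \chi_{01}^2 + \chi_{02}^2 + \chi_{03}^2 + \chi_{04}^2 + \chi_1^2 + \chi_2^2 + \sum_j \{4n X_{\cdot\cdot j\cdot}^2\} - N X_{\cdot\cdot\cdot\cdot}^2\\
&= \chi_a^2 + \chi_{01}^2 + \chi_{02}^2 + \chi_{03}^2 + \chi_{04}^2 + \chi_1^2 + \chi_2^2 + \chi_3^2.
\end{aligned} \tag{51}$$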
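The decomposition in equation (51) can be checked numerically. The sketch below is illustrative only: the data are simulated (not the experimental observations), n = 3 individuals per subclass is an arbitrary choice, and `margin_term` is a hypothetical helper. Each term is computed as (sum of values)² / (number of values), which equals the count times the squared mean.

```python
import random
from itertools import product


def margin_term(data, keep):
    """Sum, over the classes defined by the index positions in `keep`,
    of (class total)^2 / (class size)."""
    sums, counts = {}, {}
    for idx, value in data.items():
        key = tuple(idx[k] for k in keep)
        sums[key] = sums.get(key, 0.0) + value
        counts[key] = counts.get(key, 0) + 1
    return sum(sums[k] ** 2 / counts[k] for k in sums)


random.seed(0)
n = 3  # hypothetical number of individuals per sex-sight-weight subclass
data = {
    (s, i, j, t): random.gauss(20 + 5 * s + 3 * i - j, 2)
    for s, i, j, t in product(range(2), range(2), range(7), range(n))
}

a = margin_term(data, (0, 1, 2, 3))   # sum of squared observations
b = margin_term(data, (0, 1, 2))      # sex-sight-weight classes
c1 = margin_term(data, (0, 1))        # sex-sight
c2 = margin_term(data, (0, 2))        # sex-weight
c3 = margin_term(data, (1, 2))        # sight-weight
d1 = margin_term(data, (0,))          # sex
d2 = margin_term(data, (1,))          # sight
d3 = margin_term(data, (2,))          # weight
e = margin_term(data, ())             # correction term, N times squared grand mean

components = {
    "error": a - b,
    "sex x sight x weight": b - c1 - c2 - c3 + d1 + d2 + d3 - e,
    "sex x sight": c1 - d1 - d2 + e,
    "sex x weight": c2 - d1 - d3 + e,
    "sight x weight": c3 - d2 - d3 + e,
    "sex": d1 - e,
    "sight": d2 - e,
    "weight": d3 - e,
}
total = a - e
print(round(sum(components.values()), 6), round(total, 6))
```

The eight components sum to the total sum of squares whatever data are supplied, which is the additive property referred to in the text.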


Again, the additive property of the sum of squares is clearly 
demonstrated in equation (51). All the results may be summarized in 
Table 2. 


TABLE 2
Analysis of Variance of D.L. Values on Different Weights of Different
Sexes and Different Sights

Source of Variation      d.f.     Sum of Squares*
Error                    N - 28   $a - b$
Sex x sight x weight     6        $b - c_1 - c_2 - c_3 + d_1 + d_2 + d_3 - e$
Sex x sight              1        $c_1 - d_1 - d_2 + e$
Sex x weight             6        $c_2 - d_1 - d_3 + e$
Sight x weight           6        $c_3 - d_2 - d_3 + e$
Sex                      1        $d_1 - e$
Sight                    1        $d_2 - e$
Weight                   6        $d_3 - e$
Total                    N - 1    $a - e$

* where
$a = \sum_s\sum_i\sum_j\sum_t X_{sijt}^2$;
$b = \sum_s\sum_i\sum_j \{n X_{sij\cdot}^2\}$;
$c_1 = \sum_s\sum_i \{7n X_{si\cdot\cdot}^2\}$, $c_2 = \sum_s\sum_j \{2n X_{s\cdot j\cdot}^2\}$, $c_3 = \sum_i\sum_j \{2n X_{\cdot ij\cdot}^2\}$;
$d_1 = \sum_s \{14n X_{s\cdot\cdot\cdot}^2\}$, $d_2 = \sum_i \{14n X_{\cdot i\cdot\cdot}^2\}$, $d_3 = \sum_j \{4n X_{\cdot\cdot j\cdot}^2\}$;
$e = N X_{\cdot\cdot\cdot\cdot}^2$.


Now we come to our own problem which includes five variables:
2 sexes x 2 sights x 7 weights x 4 rates x 2 dates.

The mathematical expression of the D.L. value made by the t-th
individual in the s-th sex of the i-th sight on the j-th weight of the
k-th rate at the l-th date is as follows:

$$\begin{aligned}
X_{sijklt} = {} & A + B_s + C_i + D_j + E_k + F_l + I_{si} + I_{sj} + I_{sk} + I_{sl}\\
& + I_{ij} + I_{ik} + I_{il} + I_{jk} + I_{jl} + I_{kl} + I_{sij} + I_{sik} + I_{sil} + I_{sjk}\\
& + I_{sjl} + I_{skl} + I_{ijk} + I_{ijl} + I_{ikl} + I_{jkl} + I_{sijk} + I_{sijl}\\
& + I_{sikl} + I_{sjkl} + I_{ijkl} + I_{sijkl} + z_{sijklt},
\end{aligned} \tag{52}$$

where the subscripts s, i, j, k, l, and t refer to sex, sight, weight, rate,
date, and the particular individual, respectively; A is the grand mean
of all individuals; B, C, D, E, and F are the measures of the main
effects with respect to their own subscripts; the I's are the measures of
the interactions with respect to their own subscripts; $z_{sijklt}$ is the
error. So we have*


1 grand mean; 
5 measures of main effects; 

10 measures of interactions of first order; 
10 measures of interactions of second order; 
5 measures of interactions of third order; 
1 measure of interaction of fourth order. 


Obviously, the numbers of the categories are similar to the
coefficients involved in the expansion of $(a + b)^5$. If we go back to the
earlier developments regarding two and three variables, we have the
same demonstration. For two variables, we have


1 grand mean;
2 measures of main effects;
1 measure of interaction of first order.


* First order refers to the interaction between two variables; second order
refers to the interaction between three variables; and so on.


For three variables, we have


1 grand mean;
3 measures of main effects;
3 measures of interactions of first order;
1 measure of interaction of second order.


Hence, for n variables, we have


1 grand mean;
$n$ measures of main effects;
$\dfrac{n(n-1)}{2!}$ measures of interactions of first order;
$\dfrac{n(n-1)(n-2)}{3!}$ measures of interactions of second order;
. . .
$\dfrac{n(n-1)(n-2)}{3!}$ measures of interactions of (n-4)-th order;
$\dfrac{n(n-1)}{2!}$ measures of interactions of (n-3)-th order;
$n$ measures of interactions of (n-2)-th order;
1 measure of interaction of (n-1)-th order.
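The counts above are binomial coefficients, which a short sketch makes explicit. The function name `effect_counts` is hypothetical; it relies only on the fact that an interaction of order k involves k + 1 of the n variables.

```python
from math import comb


def effect_counts(n):
    """counts[k] = number of terms involving exactly k of the n variables:
    k = 0 is the grand mean, k = 1 the main effects,
    and k >= 2 the interactions of order k - 1."""
    return [comb(n, k) for k in range(n + 1)]


for n in (2, 3, 5):
    print(n, effect_counts(n))
```

For n = 5 this gives 1, 5, 10, 10, 5, 1, the same counts as listed for the present problem; the counts for any n sum to 2 to the n-th power.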


Before we develop the equations necessary to our problem in particular,
we should keep in mind that there are many measures
of interaction which should be taken into consideration. Usually the
higher orders of interaction are not significant, so some students
insist that those higher orders of interaction might be included in the
error variance. However, it may be found that some of the higher
orders of interaction are also significant and that the lower orders of
interaction are not all significant. Therefore, we should first test
every kind of interaction on the basis of error. Then we may put all
the insignificant interactions in the error term. This point has been
observed completely in the analysis of our problem.

The mathematical formulations for our problem can be obtained
by following the same method as used before. In order to conserve
space we simply summarize all the results in Table 3, in which we
assume,

. . .
$e_{10} = \sum_k\sum_l \{56 X_{\cdot\cdot\cdot kl\cdot}^2\}$;
$g = 448 X_{\cdot\cdot\cdot\cdot\cdot\cdot}^2$.


TABLE 3 (continued)

Source of Variation    d.f.   Sum of Squares

1st order
Sex x sight            1      $e_1 - (f_1 + f_2) + g$
Sex x weight           6      $e_2 - (f_1 + f_3) + g$
Sex x rate             3      $e_3 - (f_1 + f_4) + g$
Sex x date             1      $e_4 - (f_1 + f_5) + g$
Sight x weight         6      $e_5 - (f_2 + f_3) + g$
Sight x rate           3      $e_6 - (f_2 + f_4) + g$
Sight x date           1      $e_7 - (f_2 + f_5) + g$
Weight x rate          18     $e_8 - (f_3 + f_4) + g$
Weight x date          6      $e_9 - (f_3 + f_5) + g$
Rate x date            3      $e_{10} - (f_4 + f_5) + g$

Sex                    1      $f_1 - g$
Sight                  1      $f_2 - g$
Weight                 6      $f_3 - g$
Rate                   3      $f_4 - g$
Date                   1      $f_5 - g$

TOTAL                  447    $a - g$
A REVIEW


The Strong Vocational Interest Blank has made important history in the 
field of measurement. Since the appearance of the first form in 1927, the test has 
probably been the subject of more investigation and research than any other 
single test. And this research has been largely conducted or inspired by Strong 
himself. 

Vocational Interests of Men and Women is a comprehensive review of
problems in the field of the measurement of interests and a report of the prodigious
researches of the writer with his Vocational Interest Blank, as well as of selected
research by others bearing directly on his own. The emphasis is upon work since
the publication of Fryer’s notable Measurement of Interests in 1931. After an
introductory survey of the field, Strong takes up in turn the questions of
differentiation of occupations, interest factors, guidance based on interests, the
differentiation of superior and inferior members of a group, the differentiation of
skilled tradesmen, and the construction and scoring of an interest inventory.

The mass of evidence presented by Strong does not lend itself to
summarization in as small a compass as a review. Suffice it to say that the field of the
measurement of interests through use of the Strong Vocational Interest Blank
is covered systematically, completely, and in an interesting style. The data are
reported in an impartial vein; the con’s as well as the pro’s bearing upon the
various procedures used by Strong are reported in fine good humor. The reader
may not always agree with the conclusions reached by Strong, but in any case he
must recognize that the book is a model of scientific reporting. No one can
consider himself abreast of the interest field if he is not familiar with the contents
of this book.

One fundamental point, perhaps, deserves particular mention. Thurstone’s 
application of factor analysis to Strong’s interest scales in 1931 pointed to the 
possibility of a new technique in obtaining scores for various occupations. Since 
Thurstone demonstrated that the variance of the occupational scores could be 
almost entirely accounted for by four factors, it followed logically that formulas 
could be developed by multiple correlation for obtaining occupational scores from 
measures of these four factors or from any set of a few selected measures which 
would account for the variance of all the scales. The actual scoring could then 
be reduced to the fundamental measures used. Later research by Strong, using 
more scales, increased the number of factors involved, but the implication of the 
findings is the same. 

It has been a matter of surprise to some people in the field that Strong has
not followed up this lead. Strong’s reasons for not doing so, as given in his book,
are not convincing. The first and less important reason is that “occupational
scores can be obtained by machine scoring with no less trouble than with
calculations involving multiple correlations.” I am inclined to doubt this assertion.
Strong is comparing a streamlined scoring method done on a mass production
basis with a straight computing machine job of calculating scores using
regression weights. There is no reason why the application of a standard series of
weights can not also be streamlined and done on a mass production basis.

Strong’s other reason is by far the more important, since it concerns
validity. Strong cites Dwyer’s data to the effect that “the multiple correlation scores
will correlate ‘.80 or better’ in only the majority of cases with the results
obtained by use of occupational scales.” At this point we should stop to consider
the purpose of the scales. If one particular scoring method is taken as the
criterion, then we are doomed to labor in vain to develop a more valid method, for
we can not hope to improve on what is implicitly assumed to be perfect to begin
with. A more logical basis of judgment appears to be to determine which method
is more successful in differentiating those in an occupation from men-in-general.
It has not been demonstrated that such differentiation is accomplished better by
scoring items than by assigning multiple regression weights* to a few selected
measures which account for most of the interest variance in the occupations
studied. Scores obtained by the latter procedure should be more valid than those
obtained by the item-scoring method. Use of the item-scoring method carries no
assurance that the optimal weight is given to each factor, since the number of
items representing each factor is not controlled and item intercorrelations are not
allowed for in determining item weights. Theoretically, it is impossible for an
occupational scale obtained by Strong’s method to differentiate better than
weighted selected measures, if the measures used account for all the variance of the
occupational scales. There is no reason to believe that the theoretical indication
will not be borne out by the evidence, when obtained.
G. FREDERIC KUDER
G. FREDERIC KUDER 


* It should be noted that these regression weights must be developed through
use of the original data from the various occupational groups, and not by using
the occupational scale scores as criteria in developing the formulas.